A small Docker + Open vSwitch + DPDK experiment

After more than a month of tinkering, I finally got a working Docker + Open vSwitch + DPDK VNF experiment environment.

Working versions

  • Ubuntu kernel version: 4.4.0-131-generic
  • Ubuntu release: 16.04.5
  • DPDK version: 16.11.1
  • Open vSwitch version: 2.6.1
  • pktgen version: 3.1.1

clear.sh

To avoid conflicts, clear out any leftover state first:

sudo rm /usr/local/etc/openvswitch/*
sudo rm /usr/local/var/run/openvswitch/*
sudo rm /usr/local/var/log/openvswitch/ovs-vswitchd.log
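If an earlier ovsdb-server or ovs-vswitchd instance is still running it will also get in the way; a minimal sketch for stopping them (assuming pkill is available):

sudo pkill ovs-vswitchd
sudo pkill ovsdb-server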

Topology

The rough structure looks like this:
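A sketch of the layout, reconstructed from the port and flow configuration that follows:

 +------------------+            +------------------+
 |  pktgen-docker   |            |   dpdk-docker    |
 |    (pktgen)      |            |    (testpmd)     |
 |  port0    port1  |            |  port0    port1  |
 +----+--------+----+            +----+--------+----+
      |        |                      |        |
 vhost-user1  vhost-user2    vhost-user3  vhost-user4
  (OF port 1)  (OF port 2)    (OF port 3)  (OF port 4)
      |        |                      |        |
 +---------------------------------------------------+
 |       OvS bridge br0 (datapath_type=netdev)       |
 |       flows: port 1 <-> 4,  port 2 <-> 3          |
 +---------------------------------------------------+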

DPDK installation

wget http://fast.dpdk.org/rel/dpdk-16.11.1.tar.xz
tar xf dpdk-16.11.1.tar.xz
cd dpdk-stable-16.11.1

# Set the DPDK source directory location
echo export RTE_SDK=$(pwd) >> ~/.bashrc
# Set the DPDK build target
# Note: replace x86_64-native-linuxapp-gcc with the target that matches your actual environment
echo export RTE_TARGET=x86_64-native-linuxapp-gcc >> ~/.bashrc
source ~/.bashrc

# Configure DPDK: the vhost-user driver is required, so set CONFIG_RTE_LIBRTE_VHOST=y
vim config/common_base

# Build DPDK
make config T=$RTE_TARGET
make T=$RTE_TARGET -j8
# The install step is mandatory! The original blog post skipped it
make install T=$RTE_TARGET -j8

# Optionally build l2fwd as a sanity check
cd examples/l2fwd/
make

Hugepage configuration

sudo vim /etc/default/grub
# Find the line starting with GRUB_CMDLINE_LINUX_DEFAULT= ; whatever the quotes already contain, append default_hugepagesz=1GB hugepagesz=1G hugepages=8 (this allocates eight 1G hugepages)
sudo update-grub
sudo reboot

# Check the allocation
grep Huge /proc/meminfo

# Once the allocation succeeds, mount the hugepage filesystems
sudo mkdir -p /dev/hugepages
sudo mount -t hugetlbfs none /dev/hugepages
sudo mkdir -p /mnt/huge
sudo mount -t hugetlbfs -o pagesize=1G none /mnt/huge
# Note: my server already had hugepages mounted at /mnt/huge_1GB, so I skipped the steps above
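For reference, a sketch of what the edited grub line would end up as (assuming the stock Ubuntu default of quiet splash):

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash default_hugepagesz=1GB hugepagesz=1G hugepages=8"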

pktgen installation

# One version fails on the first build; this one does not
# Apparently re-running make a few times works around it (judging by Clayne Robison's comments)
sudo apt install libpcap-dev
wget http://dpdk.org/browse/apps/pktgen-dpdk/snapshot/pktgen-dpdk-pktgen-3.1.1.tar.gz
tar xzf pktgen-dpdk-pktgen-3.1.1.tar.gz
cd pktgen-dpdk-pktgen-3.1.1
make -j8
sudo ln -s $(pwd)/app/$RTE_TARGET/pktgen /usr/bin/pktgen

OvS installation and configuration

Installing OvS

# Create the required directories
sudo mkdir -p /usr/local/etc/openvswitch
sudo mkdir -p /usr/local/var/run/openvswitch

# Download
wget http://openvswitch.org/releases/openvswitch-2.6.1.tar.gz

# Extract
tar xzvf openvswitch-2.6.1.tar.gz
cd openvswitch-2.6.1

# Configure and build
./boot.sh
CFLAGS='-march=native' ./configure --with-dpdk=$RTE_SDK/$RTE_TARGET

make -j8
sudo make install -j8
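A quick sanity check that the build landed where expected:

ovs-vsctl --version
# should report version 2.6.1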

Starting OvS

# We are still inside the openvswitch-2.6.1 directory

#init new ovs database
sudo ovsdb-tool create /usr/local/etc/openvswitch/conf.db ./vswitchd/vswitch.ovsschema

#start database server
sudo ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
--remote=db:Open_vSwitch,Open_vSwitch,manager_options \
--pidfile --detach

# Initialize the OVS database
sudo ovs-vsctl --no-wait init

#configure ovs dpdk
sudo ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true \
other_config:dpdk-lcore-mask=0x00f0f0 other_config:dpdk-socket-mem="1024,1024"

#start ovs
# --log-file sets where the ovs-vswitchd daemon writes its log
sudo ovs-vswitchd unix:/usr/local/var/run/openvswitch/db.sock --log-file=/usr/local/var/log/openvswitch/ovs-vswitchd.log --pidfile --detach

Some of these options are explained below:

  • dpdk-init
    • Specifies whether OVS should initialize and support DPDK ports. This field can either be true or try. A value of true will cause the ovs-vswitchd process to abort on initialization failure. A value of try will imply that the ovs-vswitchd process should continue running even if the EAL initialization fails.
  • dpdk-lcore-mask
    • Specifies the CPU cores on which DPDK lcore threads should be spawned; expects a hex string (e.g. '0x123'). The mask used above is decoded in the sketch after this list.
  • dpdk-socket-mem
    • Comma-separated list of memory to pre-allocate from hugepages on specific sockets. If not specified, 1024 MB is set for each NUMA node by default. The value "1024,1024" above is because the machine has two NUMA nodes; with only one node it would be "1024".
  • dpdk-hugepage-dir
    • Directory where hugetlbfs is mounted
  • vhost-sock-dir
    • Option to set the path to the vhost-user unix socket files.
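For example, the dpdk-lcore-mask set above decodes as follows (my own reading of the hex value):

# 0x00f0f0 = 0b 0000 0000 1111 0000 1111 0000
#                         ^^^^      ^^^^
# -> DPDK lcore threads may be spawned on cores 4-7 and 12-15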

Creating OvS ports

# Pin the OvS PMD threads to cores 8-11 and 16-19 (mask 0x0f0f00)
sudo ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x0f0f00

#create br0 and vhost ports which use dpdk
sudo ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
sudo ovs-vsctl add-port br0 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuser
sudo ovs-vsctl add-port br0 vhost-user2 -- set Interface vhost-user2 type=dpdkvhostuser
sudo ovs-vsctl add-port br0 vhost-user3 -- set Interface vhost-user3 type=dpdkvhostuser
sudo ovs-vsctl add-port br0 vhost-user4 -- set Interface vhost-user4 type=dpdkvhostuser
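The flow rules in the next section refer to OpenFlow port numbers 1-4; the vhost-user ports receive these numbers in creation order, and the mapping can be verified with:

sudo ./utilities/ovs-ofctl show br0
# each port is listed with its OpenFlow port number, e.g. 1(vhost-user1)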

Adding flow rules

# Clear any flows left over from earlier runs
sudo ./utilities/ovs-ofctl del-flows br0

# Traffic between ports 2 and 3
sudo ./utilities/ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,idle_timeout=0,action=output:3
sudo ./utilities/ovs-ofctl add-flow br0 in_port=3,dl_type=0x800,idle_timeout=0,action=output:2

# Traffic between ports 1 and 4
sudo ./utilities/ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,idle_timeout=0,action=output:4
sudo ./utilities/ovs-ofctl add-flow br0 in_port=4,dl_type=0x800,idle_timeout=0,action=output:1

#show current flows
sudo ./utilities/ovs-ofctl dump-flows br0

# Show the vhost-user sockets in /usr/local/var/run/openvswitch
sudo ls -la /usr/local/var/run/openvswitch | grep vhost-user

OvS is now fully configured; next, build the Docker containers.

Creating the testpmd and pktgen containers

#testpmd container
cd $RTE_SDK/../

vim Dockerfile
# The Dockerfile contents are as follows
FROM ubuntu:16.04
RUN apt-get update -y
RUN apt-get install -y numactl
WORKDIR /root/dpdk
COPY dpdk-stable-16.11.1 /root/dpdk/.
ENV PATH "$PATH:/root/dpdk/x86_64-native-linuxapp-gcc/app/"

sudo docker build -t dpdk-docker:17.05 .

# This actually lands in the same directory, since I keep everything in one place; the two Dockerfiles would collide, so rename the first one after its build finishes
cd pktgen-dpdk-pktgen-3.1.1/..
vim Dockerfile
# The Dockerfile contents are as follows
FROM ubuntu:16.04
RUN apt-get update -y
RUN apt-get install -y numactl libpcap-dev
WORKDIR /root/dpdk
COPY dpdk-stable-16.11.1 /root/dpdk/.
COPY pktgen-dpdk-pktgen-3.1.1 /root/pktgen/.
RUN ln -s /root/pktgen/app/x86_64-native-linuxapp-gcc/pktgen /usr/bin/pktgen
RUN ln -s /usr/lib/x86_64-linux-gnu/libpcap.so /usr/lib/x86_64-linux-gnu/libpcap.so.1
ENV PATH "$PATH:/root/dpdk/x86_64-native-linuxapp-gcc/app/"

sudo docker build -t pktgen-docker .

# List the built Docker images
sudo docker images

Starting the containers

# Start the pktgen container
sudo docker run -ti --rm --privileged --name=pktgen-docker -v /mnt/huge_1GB:/mnt/huge -v /usr/local/var/run/openvswitch:/var/run/openvswitch pktgen-docker:latest

# pktgen must be run from the pktgen directory, i.e. the one containing Pktgen.lua
cd ../pktgen

# Run pktgen inside the container
# I left the parameters unchanged
#-c 0x19: DPDK can run on core 0,3-4: (0b0001 1001)
#--master-lcore 3: make the pktgen dpdk thread run on core 3 (0b1000)
#-n 1: we only have one memory bank in this VM
#--file-prefix pktgen: "pktgen" is used as the prefix for hugepage memory files created by this process
#--no-pci don't look for any PCI devices
#--vdev 'net_virtio_user1,mac=00:00:00:00:00:01,path=/var/run/openvswitch/vhost-user1'
#--vdev 'net_virtio_user2,mac=00:00:00:00:00:02,path=/var/run/openvswitch/vhost-user2'
#-P: Promiscuous mode
#-T: Color terminal output
#-m "0.0,4.1" (core.port): core 0: port 0 rx/tx; core 4: port 1 rx/tx
# Note: the cores used by -m must be among the cores enabled by the -c option above
./app/app/x86_64-native-linuxapp-gcc/pktgen -c 0x19 --master-lcore 3 -n 1 --socket-mem 1024,1024 --file-prefix pktgen --no-pci \
--vdev 'net_virtio_user1,mac=00:00:00:00:00:01,path=/var/run/openvswitch/vhost-user1' \
--vdev 'net_virtio_user2,mac=00:00:00:00:00:02,path=/var/run/openvswitch/vhost-user2' \
-- -T -P -m "0.0,4.1"
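# Once pktgen is up, traffic is driven from its own runtime prompt; a minimal
# sketch using standard pktgen console commands:
#   Pktgen:/> start 0        # start transmitting on port 0 (vhost-user1)
#   Pktgen:/> stop 0         # stop transmitting
#   Pktgen:/> quit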


# Start the dpdk container (for testpmd)
sudo docker run -it --rm --privileged --name=dpdk-docker \
-v /mnt/huge_1GB:/mnt/huge -v /usr/local/var/run/openvswitch:/var/run/openvswitch \
dpdk-docker:17.05

# Run testpmd inside the container
#-c 0xE0: DPDK can run on cores 5-7 (0b1110 0000)
#--master-lcore 5: make the master testpmd thread run on core 5 (0b0010 0000)
#-n 1: we only have one memory bank in this VM
#--file-prefix testpmd: "testpmd" is used as the prefix for hugepage memory files created by this process
#--no-pci don't look for any PCI devices
#--vdev=net_virtio_user3,mac=00:00:00:00:00:03,path=/var/run/openvswitch/vhost-user3
#--vdev=net_virtio_user4,mac=00:00:00:00:00:04,path=/var/run/openvswitch/vhost-user4:
# each creates a virtual device backed by the net_virtio_user driver, with the given
# MAC address and the unix socket at the given path
testpmd -c 0xE0 -n 1 --socket-mem 1024,1024 --file-prefix testpmd --no-pci \
--vdev 'net_virtio_user3,mac=00:00:00:00:00:03,path=/var/run/openvswitch/vhost-user3' \
--vdev 'net_virtio_user4,mac=00:00:00:00:00:04,path=/var/run/openvswitch/vhost-user4' \
-- -i --burst=64 --disable-hw-vlan --txd=2048 --rxd=2048 --auto-start --coremask=0xc0
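testpmd starts in interactive mode (-i) and, thanks to --auto-start, begins forwarding between its two ports immediately. A quick way to confirm packets are flowing, using standard testpmd console commands:

testpmd> show port stats all   # RX/TX counters should climb while pktgen is transmitting
testpmd> stop                  # halt forwarding and print aggregate statistics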

Final results

Results as shown in pktgen

Results as shown in testpmd

The configuration as displayed

Pitfalls encountered

  • Since this was not running in a VM, I did not apply the 2M -singlefile patch, so the options were adjusted a bit.

  • In theory both of DPDK's --socket-mem and -m options should work, yet in a Vagrant VM on my Mac, --socket-mem surprisingly did not.

  • Configuring without pinning exact versions is extremely painful.

  • OvS sometimes reported errors when adding a port; it turned out that having a log file makes debugging much easier.

    • could not add network device vhost-user0 to ofproto (No such device). My guess is that something was incompatible with the Linux kernel version, since the error never reappeared after switching kernels. The workaround at the time was to add the port, delete it, then add it again; but since no traffic was ever generated, I cannot say whether that fix was actually correct.
    • could not allocate memory (roughly this error). It happened because I started ovs-vswitchd with socket memory "1024" while the machine has two NUMA nodes, so the second node got no memory allocation at all.
  • Cases with no traffic

    • The PMD core assignment may not match the NUMA layout, leaving some socket with no usable PMD thread
    • The DPDK -c setting may be inconsistent with testpmd's or pktgen's coremask, i.e. the coremask touches cores that are not enabled in -c (see the sketch after this list).
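As a concrete check, here is how the masks used above line up (my own decoding of the hex values):

# pktgen:  -c 0x19  = 0b0001 1001           -> EAL may use cores 0, 3, 4
#          --master-lcore 3                 -> master thread pinned to core 3
#          -m "0.0,4.1"                     -> core 0 drives port 0, core 4 drives port 1
#          (cores 0 and 4 are both inside 0x19, so the mapping is consistent)
# testpmd: -c 0xE0  = 0b1110 0000           -> EAL may use cores 5, 6, 7
#          --coremask=0xc0 = 0b1100 0000    -> forwarding on cores 6 and 7 (inside 0xE0)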

References

  • https://github.com/intel/SDN-NFV-Hands-on-Samples/tree/master/DPDK_in_Containers_Hands-on_Lab/dpdk-container-lab
  • https://www.youtube.com/watch?v=hEmvd7ZjkFw&index=1&list=PLg-UKERBljNx44Q68QfQcYsza-fV0ARbp
  • https://blog.csdn.net/me_blue/article/details/78589592
  • http://blog.sina.com.cn/s/blog_da4487c40102v2ic.html
  • http://docs.openvswitch.org/en/latest/intro/install/dpdk/
  • Other specific errors were resolved by searching Google
  • The --log-file question was resolved by asking a senior classmate