Reference video:
http://www.youtube.com/watch?v=vjRJ8d_0Jwo
Monday, July 22, 2013
Tuesday, July 16, 2013
Sunday, July 07, 2013
Secret of Earning an Engineering PhD Degree
Created on July 8, 2013
Engineering is always about decoupling things and making the problem as simple as possible, ending with a runnable solution.
Research, however, mostly follows a different style.
It must not only show how something works, but also explain why.
It abstracts the work into theory and demonstrates it with scenarios.
Hence, the secret of earning an engineering PhD degree is to master both styles and to stay clear about which one to pick at the right time.
Updated on Sep 18, 2013
So what exactly should a PhD be?
I think PhD = Dreamer + Engineer + Artist + Orator.
Thursday, June 20, 2013
OpenStack VM DHCP problem with Quantum? Guideline and real case
When using OpenStack in practical scenarios, the devil is in the many details. One notorious bug is that a booted vm sometimes cannot get an IP address via DHCP automatically. Many people have encountered similar problems and proposed several solutions, including restarting the quantum-related services. However, this may work in some special cases while failing in others.
So how do you find the real culprit behind your specific problem? In this article, we give a guideline for locating the cause of a DHCP failure and demonstrate it with a real case.
Debug Guideline:
0) Start a DHCP request in the vm using
sudo udhcpc
or another DHCP client.
1) Does the DHCP request reach the network node?
If not, use tcpdump to capture packets on the data-network interfaces of both the compute node and the network node, for example with
tcpdump -ni eth1 port 67 and port 68
A DHCP request usually looks like
IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from fa:16:3e:82:ee:fe, length 286
2) If the DHCP request successfully reaches the network node, make sure the quantum-dhcp-agent sends a reply. This can be verified in the log file (/var/log/syslog) or with tcpdump.
The log may look like
Jun 21 10:42:31 localhost, dnsmasq-dhcp[541]: DHCPREQUEST(tap9c753e61-fc) 50.50.1.6 fa:16:3e:82:ee:fe
Jun 21 10:42:31 localhost, dnsmasq-dhcp[541]: DHCPACK(tap9c753e61-fc) 50.50.1.6 fa:16:3e:82:ee:fe 50-50-1-6
and a DHCP reply usually looks like
IP 50.50.1.3.67 > 50.50.1.7.68: BOOTP/DHCP, Reply, length 308
If there is no reply, make sure the quantum-* services have started successfully on the network node.
service quantum-dhcp-agent status
3) Use tcpdump again to make sure the DHCP reply gets back to the compute node.
4) If the DHCP reply reaches the compute node, capture on the vm’s corresponding tap-* network interface to make sure the reply can reach the vm.
If not, check that the quantum-plugin-openvswitch-agent service works properly on the compute node.
service quantum-plugin-openvswitch-agent status
5) Sometimes you may need to reboot the whole node if the problem keeps appearing on one particular machine.
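When repeating the capture hop by hop as in the guideline above, it can help to reduce each capture to two counters. The following is only a minimal sketch: the function name dhcp_summary is made up for illustration, and it assumes tcpdump prints the "BOOTP/DHCP, Request" and "BOOTP/DHCP, Reply" text shown earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenStack or tcpdump):
# reads tcpdump text output on stdin and counts DHCP requests and replies.
# Assumption: lines contain "BOOTP/DHCP, Request" or "BOOTP/DHCP, Reply",
# as in the sample output quoted in this post.
dhcp_summary() {
    awk '
        /BOOTP\/DHCP, Request/ { req++ }
        /BOOTP\/DHCP, Reply/   { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

For example, a capture could be piped in with something like `sudo timeout 10 tcpdump -lni eth1 port 67 or port 68 | dhcp_summary` while the vm runs its DHCP client. The interface name eth1 is the one used above and will differ per node. If a hop reports requests but no replies, the reply is being lost before that hop.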
A real case
I once met a weird case.
In this case, everything seemed OK. The network node got the DHCP request and sent back the offer, and the compute node successfully received the DHCP offer. However, the vm still failed to get an IP at times, though occasionally it got one!
I went through the entire process very carefully and made sure all services were started.
The only suspicious component left was Open vSwitch.
I checked the OpenFlow rules on br-int (the bridge the vm is attached to) using
ovs-ofctl dump-flows br-int
and they looked like:
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=2219.925s, table=0, n_packets=0, n_bytes=85038, idle_age=3, priority=3,in_port=1,dl_vlan=2 actions=mod_vlan_vid:1,NORMAL
cookie=0x0, duration=2231.487s, table=0, n_packets=0, n_bytes=120021, idle_age=3, priority=1 actions=NORMAL
cookie=0x0, duration=2227.341s, table=0, n_packets=0, n_bytes=16868, idle_age=5, priority=2,in_port=1 actions=drop
They look quite normal, as all the rules are generated by the quantum-plugin-openvswitch-agent service.
I also made sure the DHCP offer reached br-int by capturing packets on its data network interface.
tcpdump -ni int-br-eth1 port 67 or port 68
As I expected, the DHCP offer should match rule #1 (VLAN mode) and be sent out. However, after watching for a while, its n_packets did not increase, which means the DHCP offer did not match the rule.
Strange, right? Why did ovs not work as expected?
Based on my years of experience with ovs, I thought there must be some HIDDEN rule disrupting the processing, so I checked the rules in more detail.
ovs-appctl bridge/dump-flows br-int
Haha, now something surfaces.
duration=151s, priority=180001, n_packets=0, n_bytes=0, priority=180001,arp,dl_dst=fe:86:a7:fd:c0:4f,arp_op=2,actions=NORMAL
duration=151s, priority=180003, n_packets=0, n_bytes=0, priority=180003,arp,dl_dst=00:1a:64:99:f2:72,arp_op=2,actions=NORMAL
duration=148s, priority=3, n_packets=0, n_bytes=0, priority=3,in_port=1,dl_vlan=2,actions=mod_vlan_vid:1,NORMAL
duration=151s, priority=180006, n_packets=0, n_bytes=0, priority=180006,arp,nw_src=10.0.1.197,arp_op=1,actions=NORMAL
duration=151s, priority=180004, n_packets=0, n_bytes=0, priority=180004,arp,dl_src=00:1a:64:99:f2:72,arp_op=1,actions=NORMAL
duration=151s, priority=180002, n_packets=0, n_bytes=0, priority=180002,arp,dl_src=fe:86:a7:fd:c0:4f,arp_op=1,actions=NORMAL
duration=151s, priority=15790320, n_packets=174, n_bytes=36869, priority=15790320,actions=NORMAL
duration=151s, priority=180005, n_packets=0, n_bytes=0, priority=180005,arp,nw_dst=10.0.1.197,arp_op=2,actions=NORMAL
duration=151s, priority=180008, n_packets=0, n_bytes=0, priority=180008,tcp,nw_src=10.0.1.197,tp_src=6633,actions=NORMAL
duration=151s, priority=180007, n_packets=0, n_bytes=0, priority=180007,tcp,nw_dst=10.0.1.197,tp_dst=6633,actions=NORMAL
duration=151s, priority=180000, n_packets=0, n_bytes=0, priority=180000,udp,in_port=65534,dl_src=fe:86:a7:fd:c0:4f,tp_src=68,tp_dst=67,actions=NORMAL
table_id=254, duration=165s, priority=0, n_packets=13, n_bytes=2146, priority=0,reg0=0x1,actions=controller(reason=no_match)
table_id=254, duration=165s, priority=0, n_packets=0, n_bytes=0, priority=0,reg0=0x2,drop
See that? Packets are matching the rule with priority=15790320 (shown in red in the original post), which has a very high priority and simply forwards the VLAN packets as NORMAL!!
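With a long dump, one way to spot the rule that is really taking the traffic is to keep only the flows whose n_packets is non-zero and sort them by priority. The sketch below is just an illustration: hot_flows is a made-up name, and it assumes each rule sits on one line with priority= and n_packets= fields, as in the ovs-appctl output above.

```shell
#!/bin/sh
# Hypothetical helper: reads flow dump text on stdin and prints
# "<priority> TAB <n_packets> TAB <original line>" for every flow that has
# matched at least one packet, highest priority first.
# Assumption: one rule per line, with "priority=<int>" and "n_packets=<int>".
hot_flows() {
    awk '
        match($0, /n_packets=[0-9]+/) {
            # "n_packets=" is 10 characters long
            pkts = substr($0, RSTART + 10, RLENGTH - 10) + 0
            if (pkts == 0) next
            prio = 0
            # "priority=" is 9 characters long; the first occurrence is used
            if (match($0, /priority=[0-9]+/))
                prio = substr($0, RSTART + 9, RLENGTH - 9) + 0
            printf "%d\t%d\t%s\n", prio, pkts, $0
        }
    ' | sort -t "$(printf '\t')" -k1,1nr
}
```

It could be used as `ovs-appctl bridge/dump-flows br-int | hot_flows`. On the dump above, the priority=15790320 rule would come out on top, followed by the table 254 rule with 13 packets.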
So where does this rule come from?
In some versions of ovs, when ovs is started without any controller specified, it may "smartly" work like an L2 switch, and some rules are added automatically.
Now, how do we solve the problem?
We need to tell ovs not to be that “smart”, with the command:
ovs-vsctl set bridge br-int fail-mode=secure
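After changing the setting, it is worth confirming that it took effect. A minimal sketch, assuming the ovs-vsctl get-fail-mode subcommand is available in your ovs version; the function name and the OVS_VSCTL override variable are hypothetical, added only so the check can be tried without a real switch:

```shell
#!/bin/sh
# OVS_VSCTL is a hypothetical override hook (e.g. for a stub);
# it defaults to the real ovs-vsctl command.
OVS_VSCTL=${OVS_VSCTL:-ovs-vsctl}

# Succeeds (exit status 0) only if the bridge's fail mode is "secure".
# An empty result from get-fail-mode means the mode is unset.
is_fail_mode_secure() {
    bridge=$1
    mode=$($OVS_VSCTL get-fail-mode "$bridge" 2>/dev/null)
    [ "$mode" = "secure" ]
}
```

A typical use would be `is_fail_mode_secure br-int && echo "br-int is in secure mode"`.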
Finally, this problem puzzled our team for several weeks. While solving it, I summarized the guideline above and hope it will be of some help.
Monday, May 27, 2013
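Configuring Floodlight as the Network Backend Plugin in OpenStack
In OpenStack, Quantum is the networking component, while the actual network logic is implemented by the backend plugins.
The following describes how to configure Floodlight as the network backend.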
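Compute nodes: configure the ovs controller
First, every Open vSwitch instance must be configured with the controller information. This can be done on all nova-compute nodes with the following script.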
NETWORK_CONTROLERS=<comma-seperated-list-of-network-ctrls>
# Recreate br-int from scratch
sudo ovs-vsctl --no-wait -- --if-exists del-br br-int
sudo ovs-vsctl --no-wait add-br br-int
sudo ovs-vsctl --no-wait br-set-external-id br-int bridge-id br-int
# set-controller replaces the whole controller list, so collect all targets and pass them in one call
targets=""
for ctrl in $(echo "${NETWORK_CONTROLERS}" | tr ',' ' ')
do
    targets="${targets} tcp:${ctrl}:6633"
done
sudo ovs-vsctl set-controller br-int ${targets}
Stop the quantum-plugin-openvswitch-agent service.
service quantum-plugin-openvswitch-agent stop;
Network node: stop the conflicting service.
service quantum-l3-agent stop;
Control node: create the database, install the new plugin, and update the quantum plugin configuration
1. MySQL must be installed first; then create the restproxy_quantum database.
$ mysql -u root -p$PASS -e 'DROP DATABASE IF EXISTS restproxy_quantum;'
$ mysql -u root -p$PASS -e 'CREATE DATABASE IF NOT EXISTS restproxy_quantum;'
2. Install the restproxy plugin.
apt-get install quantum-plugin-bigswitch
3. Edit /etc/quantum/quantum.conf and change core_plugin to
[DEFAULT]
core_plugin = quantum.plugins.bigswitch.plugin.QuantumRestProxyV2
allow_overlapping_ips = False
lock_path = <path_to_which_quantum_process_can_write_to>
Here, lock_path only needs to be set when installing from packages. When installing from devstack, the default lock_path value works.
4. Edit /etc/default/quantum-server and change QUANTUM_PLUGIN_CONFIG to
QUANTUM_PLUGIN_CONFIG="/etc/quantum/plugins/bigswitch/restproxy.ini"
5. Edit /etc/quantum/plugins/bigswitch/restproxy.ini and set it to
[DATABASE]
sql_connection = mysql://<username>:<password>@<database_ip>:3306/restproxy_quantum?charset=utf8
[RESTPROXY]
servers=<controller_ip:port_num>,<controller_ip:port>
serverauth=<username>:<password>
serverssl=False
A sample configuration is
[DATABASE]
sql_connection = mysql://root:pass@127.0.0.1:3306/restproxy_quantum
[RESTPROXY]
servers=192.168.1.100:8080,192.168.1.101:8080
serverauth=user:pass
serverssl=False
After changing the configuration, stop the conflicting openvswitch-controller service and restart the quantum-server service.
service openvswitch-controller stop;
service quantum-server restart;
6. Start floodlight; you can then check whether ovs on the compute nodes has successfully connected to floodlight
ant;
java -Dlogback.configurationFile=logback.xml -jar target/floodlight.jar -cf src/main/resources/quantum.properties
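To check the connection from the compute node side, the output of ovs-vsctl show can be inspected: a connected controller is normally followed by an "is_connected: true" line. The sketch below is only an illustration under that assumption about the output layout; controller_status is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: reads `ovs-vsctl show` output on stdin and prints
# "<target> connected" or "<target> disconnected" for each Controller entry.
# Assumption: a connected controller has "is_connected: true" on the line
# directly after its 'Controller "tcp:..."' line.
controller_status() {
    awk '
        pending != "" {
            if ($1 == "is_connected:" && $2 == "true")
                print pending, "connected"
            else
                print pending, "disconnected"
            pending = ""
        }
        $1 == "Controller" { pending = $2; gsub(/"/, "", pending) }
        END { if (pending != "") print pending, "disconnected" }
    '
}
```

On a compute node this could be run as `sudo ovs-vsctl show | controller_status`; each controller configured above should be reported as connected once floodlight is up.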