
If you have read more than two posts on this blog, or ever worked with me troubleshooting a network issue, then you know I am a huge fanboy of WireShark. I install it on everything. It is literally a part of my standard server build. While working with one of my consulting clients on an issue with Microsoft Forefront TMG 2010 and Network Load Balancing, I experienced something that I cannot explain, that matches nothing I can find on Google even remotely, and yet that I cannot write off as a fluke. Hence, this blog post.
I cannot say for sure whether this was a WireShark issue, a WinPcap issue, a TMG issue, or something completely different, but four engineers all agree that something funky was going on, and what finally made things work (detailed below) seems to point to WinPcap. We simply did not have the time and resources to dedicate to a full repro or further testing, so I am not about to abandon WireShark. I am going to keep this experience in mind, though. If you have run into problems publishing an application, where things only seem to work fully when WireShark is installed and running a capture, read on.
The setup
Two 2008 servers running MS Forefront TMG 2010, set up for NLB using multicast. They are guests running in an ESX environment of three servers with vMotion. They have a dedicated vSwitch for the external network and a dedicated vSwitch for the internal. In addition to the primary VIP, the cluster has three more VIPs and, of course, the dedicated ip.addrs on each NIC. WireShark is installed and running to help debug network issues. Rules are set up to publish on all four VIPs and also on each dedicated ip.addr, just for testing. The version of WireShark running on the TMGs is 1.2.10, and the version of WinPcap that was running is the one that came with the WireShark installer, 4.1.1. Note that I have a production cluster of TMG servers running NLB in multicast mode, with WireShark installed on both TMG servers, without issue, so I am not convinced that this is a problem in all situations. Something about their setup is unique; I am just not sure what it is.
The problem
With WireShark running on the TMGs, server publishing appears to work like a champ on all VIPs as well as on the physical addresses. But if you stop the captures, publishing fails on all the VIPs. The TMG logs show no sign that the TMG is processing the traffic. Publishing to the physical addresses still works.
In the case of a publishing rule using a VIP that redirects HTTP to HTTPS, we could make the connection on HTTP and get the 302 back with the HTTPS URL, but when the client then SYNs to 443, the TMGs do not respond and do not log anything about the connection request. We observed this by running WireShark on the client: all SYNs to 443 go unanswered. We started the capture on the TMG servers again, and everything started working again.
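If you want to reproduce the symptom from a client without staring at a capture, a small probe can tell a SYN that goes unanswered apart from one that is actively refused. This is only a minimal sketch, not something we ran at the time; the function names `probe_tcp` and `probe_all` are my own, and you would supply your own VIPs and dedicated addresses.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify one TCP connection attempt.

    Returns "open" if the handshake completes, "refused" if the target
    answers with a reset, and "no-answer" if nothing comes back before
    the timeout (the behaviour we saw on the VIPs with the capture stopped).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "no-answer"
    except OSError as exc:
        # Anything else (no route, unreachable network, and so on).
        return "error: %s" % exc
    finally:
        sock.close()


def probe_all(targets, timeout=3.0):
    """Probe every (host, port) pair and return a dict of results."""
    return {(host, port): probe_tcp(host, port, timeout) for host, port in targets}
```

Calling `probe_all` with a list of `(address, port)` pairs for each VIP and each dedicated address, once with the capture running and once with it stopped, gives a quick side-by-side of which addresses go silent.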
A good idea that didn’t work
First we upgraded WinPcap to the latest stable release, 4.1.2. This did not seem to make any difference. We then uninstalled WinPcap and WireShark, rebooted, and got sidetracked troubleshooting other possibilities. After hours spent debugging, testing, scrapping and republishing rules, we finally got the network engineer who controls the switches to do a capture on the switch itself. Since he could see frames leaving the switch on the Ethernet port the TMGs were plugged into, we decided to reinstall WireShark on the TMGs to take another look.
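When comparing a switch capture with what the TMGs see, it helps to know which destination MAC to look for. As far as I know, NLB in multicast mode uses a cluster MAC of 03-BF followed by the four octets of the primary cluster IP (02-BF in unicast mode), and all the VIPs share it. The sketch below derives that MAC and a matching WireShark display filter; the helper names are mine, and the address in the usage note is a documentation placeholder, not the client's.

```python
import ipaddress


def nlb_cluster_mac(primary_cluster_ip, mode="multicast"):
    """Return the NLB cluster MAC derived from the primary cluster IP.

    Assumes the usual NLB convention: 03-BF-w-x-y-z for multicast mode
    and 02-BF-w-x-y-z for unicast mode, where w.x.y.z is the IP.
    IGMP multicast mode uses a different scheme and is not handled here.
    """
    prefixes = {"multicast": (0x03, 0xBF), "unicast": (0x02, 0xBF)}
    if mode not in prefixes:
        raise ValueError("mode must be 'multicast' or 'unicast'")
    octets = tuple(ipaddress.IPv4Address(primary_cluster_ip).packed)
    return ":".join("%02x" % b for b in prefixes[mode] + octets)


def display_filter_for(mac):
    """Build a WireShark display filter matching frames sent to that MAC."""
    return "eth.dst == %s" % mac
```

For example, `display_filter_for(nlb_cluster_mac("192.0.2.10"))` yields `eth.dst == 03:bf:c0:00:02:0a`, which can be used on both the switch capture and the TMG capture to check whether the same frames show up in each.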
The gun’s not smoking, but I heard a loud noise
During the reinstall, the installer reported that WinPcap was already installed. Remember, three sentences (and several hours) ago we uninstalled WireShark and WinPcap. Okay, that was interesting. We proceeded with the install, fired up WireShark, and everything started working again with the publishing rules on the VIPs. We stopped the capture, and it all broke again, with nothing in the TMG logs. It seems that either WinPcap was blocking the packets from getting to the NDIS layer, or something about it was causing the NIC to drop the frames before the operating system could see them. Since we had already tried the uninstall route without success, we tore down and rebuilt the TMGs to the exact same build as before. Everything now works like a champ.
Again, let me make clear…I don’t know for sure what the issue is. By posting this, I hope anyone who has encountered a similar issue will find it, and either take some solace that they are not alone, or perhaps even leave a comment so I know I am not crazy…well, at least, not about this.
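In hindsight, it would have been worth checking what the uninstall actually left behind before rebuilding. The sketch below looks for the files WinPcap 4.x normally drops under the Windows directory (the NPF driver `npf.sys`, plus `wpcap.dll` and `Packet.dll`). We did not run this at the time, the function name is my own, and the file list is from memory of a typical install, so treat it as an assumption.

```python
import os

# Files a WinPcap 4.x install normally places under %SystemRoot%
# (assumed list; 64-bit systems may also have copies under SysWOW64).
WINPCAP_FILES = (
    os.path.join("System32", "drivers", "npf.sys"),
    os.path.join("System32", "wpcap.dll"),
    os.path.join("System32", "Packet.dll"),
)


def leftover_winpcap_files(system_root):
    """Return the WinPcap files that still exist under system_root."""
    found = []
    for relative in WINPCAP_FILES:
        candidate = os.path.join(system_root, relative)
        if os.path.isfile(candidate):
            found.append(candidate)
    return found
```

On the server itself you would call `leftover_winpcap_files(os.environ["SystemRoot"])` after the uninstall and reboot. An empty list means the files are gone; running `sc query npf` from a command prompt additionally shows whether the NPF driver service is still registered.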
1 comment
Sounds strange. I’ve certainly not seen that, nor would I know why it would cause that issue. From what I understand though, Wireshark, through WinPcap, shims the network stack. So I’m assuming it’s a bug with one or both, and it flakes out ALL network stack traffic, preventing TMG from even seeing anything.
I would start with WinPcap or Wireshark forum assistance and get as much backup documentation on it as you can. Does Wireshark even have logging?