Today we were faced with a very interesting problem related to SSIS and Connectivity. There is an ETL process which seems to be having intermittent connectivity drops. SSIS just reports a connection ending error message. No details of what and why. The ETL process is crucial to our Data Warehouse loads of which the DW is instrumental in every day business operations. The ETL process is triggered by a different process which unfortunately can be run at any time.
The data center housing the source of the ETL process’s data pull does not believe it is their issue and don’t want to trace down the root cause. The issue is not reproducible at will, but does happen consistent enough to require attention. This is where Packet Captures come into play.
Packet Captures (PCAPs) can help determine at the lowest level what is happening during these events by capturing the actual network packets being sent and received over a network interface. We can evaluate these packets and determine whether the connection is being terminated by the Source database engine or due to some other connectivity issue along the path of packet delivery.
Depending on the environment, there may be many network devices between points of communication, especially when the ETL Process is sourced on an Azure machine. If we can determine whether the connection issues are caused by the Source Database Engine or Network Packet Transmissions issues in between, we can then focus our investigation.
What to use?
The first step in accomplishing our goal is to figure out what packet capture product we want to use. We have used WireShark in the past so this was a simple choice. The free version has all the necessary tools to accomplish the goal in the end. Wireshark has an excellent command line execution method which can include Capture Filters, Duration or Packet count limits, and many other options.
How to trigger?
The next step was to determine how are we going to trigger the execution of the packet capture dynamically. This is a concern because the ETL process’s execution is dynamic and not on a set schedule. We chose to use SSIS and incorporate the packet capture as an additional Package to the whole ETL Process. This will allow for the packet capture to be placed anywhere within the ETL process and configurable to whatever the needs are at the time.
We first needed to get the syntax of the Wireshark command line correct. After a few google searches and iterations we settled on the following command to execute:
“tshark.exe -i “Ethernet” -k -a duration:3600 -w “C:\test.pcap” -f “host 192.168.0.1″”
-k = Start Capture Immediately
-i = Name of the Network Interface
-a = The duration we want to capture to take place with a value in seconds.
-w = The path of file containing the capture output.
-f = collection of capture filters. We chose to only capture source and destination information for a specific IP address.
We created a new SSIS Package named PCAP. This original thought was to use Package Variables to generate the arguments list for an Execute Process task configured to run tshark.exe. This seemed simple enough… right???
Unfortunately, we were not able to get the Execute Process to generate a valid argument list for tshark.exe to execute. We used the escape character of “\” in front of all double quotes where necessary and when we evaluated the expression within the Execute Process task it look perfect, but the execution would not take.
So we had to hard code the Execute Process Task, which this also meant hard coding the file name. This poses a problem as we need to have this package execute multiple times, capture a new file with each execution, and not overwrite the information for each distinct execution.
We chose to use a Script task after the Execute Process task, to derive a new file name and rename the capture file.
Once we had this in place, things started to work like a champ. At the end of it all, it seems we CAN have network monitoring in place during ETL processes, initiated by ETL process and managed by ETL Processes.
Written By: Warren Sifre