Message-ID: <14873972.1075840626014.JavaMail.evans@thyme>
Date: Mon, 27 Nov 2000 08:21:00 -0800 (PST)
From: keith.dziadek@enron.com
To: dan.dietrich@enron.com, tim.belden@enron.com, murray.o'neil@enron.com, 
	arshak.sarkissian@enron.com
Subject: RE: IT problems Sunday
Cc: chip.cox@enron.com, diana.willigerod@enron.com, sammy.abu-khalaf@enron.com, 
	bruce.smith@enron.com, david.poston@enron.com, steve.nat@enron.com, 
	jeff.richter@enron.com, john.forney@enron.com, mark.guzman@enron.com, 
	monika.causholli@enron.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 7bit
Bcc: chip.cox@enron.com, diana.willigerod@enron.com, sammy.abu-khalaf@enron.com, 
	bruce.smith@enron.com, david.poston@enron.com, steve.nat@enron.com, 
	jeff.richter@enron.com, john.forney@enron.com, mark.guzman@enron.com, 
	monika.causholli@enron.com
X-From: Keith Dziadek
X-To: Dan Dietrich, Tim Belden, Murray P O'Neil, Arshak Sarkissian
X-cc: Chip Cox, Diana Willigerod, Sammy Abu-Khalaf, Bruce Smith, David Poston, Steve Nat, Jeff Richter, John M Forney, Mark Guzman, Monika Causholli
X-bcc: 
X-Folder: \mark guzman 6-28-02\Notes Folders\All documents
X-Origin: GUZMAN-M
X-FileName: mark guzman 6-28-02.nsf

Sunday, November 26, at approximately 10:30 CST, EIGRP (routing protocol) 
neighbor connections between the Portland router and the Houston router began 
to time out.  The frequency of this occurrence was mentioned in the previous 
email from Phillip Platter.  The problem cleared up at 16:56 CST.  When 
debugging the problem, all test results pointed to a problem on the EIN.  
However, when the EBS NOC was contacted, they informed us that there had been 
no known network issues or scheduled changes.  Because of what I experienced 
while debugging, I had an EBS network engineer come to my office this morning 
to walk through the historical log data of physical devices within the 
Houston to Portland path.  The problem was finally pinpointed to an EIN DS3 
link.  A GSR router to which this link was attached was losing OSPF neighbor 
connections with its peer on the other side of the link due to a bouncing 
interface, thus causing the EIGRP neighbor connectivity errors.
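
For illustration only, a bouncing interface of the kind described above could 
be flagged from historical up/down log data roughly as follows.  This is a 
minimal sketch, not the method the EBS engineer used: the function names, the 
(timestamp, state) event format, and the window and threshold parameters are 
all assumptions.

```python
from datetime import datetime, timedelta


def count_transitions(events):
    """Count state changes in a time-ordered list of (timestamp, state) pairs."""
    changes = 0
    for (_, prev), (_, curr) in zip(events, events[1:]):
        if prev != curr:
            changes += 1
    return changes


def is_bouncing(events, window, threshold):
    """Return True if at least `threshold` state changes fall within any
    span of length `window` (a timedelta). Hypothetical heuristic."""
    # Timestamps at which the interface state differs from the previous entry.
    change_times = [
        t2 for (_, s1), (t2, s2) in zip(events, events[1:]) if s1 != s2
    ]
    for i, start in enumerate(change_times):
        in_window = [t for t in change_times[i:] if t - start <= window]
        if len(in_window) >= threshold:
            return True
    return False
```

A caller would pass the parsed log entries for one interface, for example 
is_bouncing(events, timedelta(minutes=5), 3); suitable values depend on the 
link and are not taken from this incident.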

Realizing this is a critical link on which 24x7 trading is conducted, the 
following action items will be addressed:
- We will implement our own monitoring device (currently only able to 
monitor up/down/response time) on the edge of the EIN in an attempt to 
monitor the path being utilized more proactively.
- I will verify with EBS that all appropriate devices are being monitored by 
the NOC and that a notification process is implemented (it should include 
both problem and change notification).
- I will push the issue of gaining access to the EIN edge router.  This will 
allow us to troubleshoot, manage, and monitor EIN link status and errors 
more efficiently.  Knowing the error type and frequency will allow for 
quicker determination of severity and, therefore, implementation of 
secondary solutions.
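
As a rough sketch of what up/down/response-time monitoring amounts to (not a 
description of the actual monitoring device), a probe could attempt a TCP 
connection and time it.  The function names, the choice of a TCP connect as 
the probe, and the timeout value are assumptions made for illustration.

```python
import socket
import time


def probe(host, port, timeout=2.0):
    """Attempt one TCP connection; return (is_up, response_seconds or None)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False, None


def summarize(samples):
    """Reduce (is_up, response_seconds) samples to availability and mean
    response time over the successful probes."""
    up_times = [rtt for is_up, rtt in samples if is_up]
    availability = len(up_times) / len(samples) if samples else 0.0
    mean_rtt = sum(up_times) / len(up_times) if up_times else None
    return availability, mean_rtt
```

Running probe() on a schedule against a device across the path and feeding 
the results to summarize() would give the up/down and response-time view; it 
would not show error type or frequency, which is why access to the edge 
router is still needed.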



Arshak,
Because of the following concerns, it appears to me that the NOC is not 
appropriately monitoring EIN devices and, in turn, is negatively impacting 
Enron business.  I am copying you on this note because I need your assistance 
in identifying a contact at the NOC to whom I can communicate these concerns 
and with whom I can work to implement processes to help prevent future issues.

Concerns:
- Time required to fix this problem (6.5 hours - I think it finally cleared 
up on its own)
- The lack of notification or knowledge of the problem
- The erroneous information being communicated to the customer (Enron Net 
Works).

Keith



 -----Original Message-----
From:  Dietrich, Dan  
Sent: Monday, November 27, 2000 9:35 AM
To: Belden, Tim; O'Neil, Murray
Cc: Cox, Chip; Willigerod, Diana; Abu-Khalaf, Sammy; Bruce 
Smith/HOU/ECT@ENRON; Poston, David; Nat, Steve; Dziadek, Keith; Richter, 
Jeff; Forney, John; Guzman, Mark; Causholli, Monika
Subject: IT problems Sunday

Tim and Murray,

Yesterday the real-time desk experienced dropped Terminal Server (TS) 
sessions which impacted their productivity. Diana and Chip contacted the 
network team in Houston to assist with troubleshooting the problem. Keith 
Dziadek monitored the DS3 VPN connection between Portland and Houston and has 
determined that it is the dynamic fail-over within the Enron Intelligent 
Network (EIN) which is causing the dropped TS sessions. This dynamic 
fail-over is a good thing. Our issue is that the TS sessions are so 
sensitive that the duration of the dynamic fail-over on the EIN causes them 
to drop, while all other applications (e.g. Enpower running locally in 
Portland) are not affected.

Keith is meeting with his team in Houston as well as staff from EBS today to 
form an action plan. Once the action plan has been established, Keith or I 
will email you what will be done and when.

You can reach me on my cell if you have any questions.

Regards,

Dan Dietrich
Cell: 403-818-0815


---------------------- Forwarded by Diana Willigerod/PDX/ECT on 11/26/2000 
03:33 PM ---------------------------
   
	  From:  Phillip Platter                           11/26/2000 03:19 PM
	


To: Jeff Richter/HOU/ECT@ECT, Tim Belden/HOU/ECT@ECT, John M 
Forney/HOU/ECT@ECT, Murray P O'Neil/HOU/ECT@ECT
cc: Mark Guzman/PDX/ECT@ECT, Monika Causholli/PDX/ECT@ECT, Diana 
Willigerod/PDX/ECT@ECT, Chip Cox/PDX/ECT@ECT, Paul Kane/PDX/ECT@ECT 

Subject: IT problems Sunday

Please be advised of a serious problem with Terminal Server! 

We were all kicked out of Terminal Server repeatedly on Sunday.  At first 
it happened about every half hour (9:00 a.m. to 10:30 a.m.).  Then it 
happened every 5 minutes. 
Communication seemed slow.  We paged and called the Portland IT Team at 
11:30 (Diana & Chip).  At 12 noon Diana said Chip and Paul Kane were working 
on it.  At around 12:30 Paul Kane called back and said "Vince" in Houston 
was working on it.  We asked for Vince's phone number and Paul stated that 
Vince would call us instead. Vince called at about 1:00 and said everything 
was now working.  Five minutes later we were still getting kicked out.  We 
then paged and called Chip and he said he was trying to get hold of 
Houston.  At 1:30 we called the Resolution Center. They indicated they had no 
clue we were having problems. At 1:45 we called Greg Marsalis in Houston and 
he said he was unaware of who was working on it, but was going to look into 
it.

Greg did look into it and said they were still not able to pinpoint the 
problem.  At this writing (3:10 p.m.), we are still not able to use Terminal 
Server and consequently are unable to transact with the ISO or to enter 
deals in Enpower.  This has also affected our connection to PX Trade App.  We 
are effectively prevented from taking advantage of market conditions in 
California.  Earlier I had started to initiate some congestion relief on the 
Palo Verde tie.  Congestion was $225 to $250 at Palo and I was confident I 
could take advantage by moving power out of SP.  I was unable to do anything 
in Caps as I was kicked out every few minutes.

I feel the communication process regarding this problem has been flawed.  I'm 
sure there is plenty of blame to go around.  I suggest we meet with IT to 
define a process where we will get immediate attention and results.  

As of 3:20 we still have the same problems. 

I am hoping this problem is isolated to Sunday. 


