Tuesday, December 21, 2010

Professional Life Update – Winter 2010

Life hasn’t slowed down for the holidays in 2010!  Earlier this week I accepted a job with a new employer.  I greatly enjoyed my time with Genworth Financial, and I’ll miss working with my team and the larger IT department there.  It was a wonderful 6+ years with Genworth, and nearly a decade within the General Electric organization.

I expect that my new position will keep me very busy.  For this reason, I have bowed out of teaching CCDE courses for CCBOOTCAMP.  I want to thank Dawn & Dawn, as well as Brad, for their guidance and support in 2010.  I grew quite a bit as both an engineer and communicator through my experiences with CCBOOTCAMP, and if my schedule ever allows it, I would be happy to teach again in the future.

So as not to leave CCDE candidates ‘high and dry’, I will continue to offer my CCDE Practical practice exams on a regular schedule.  Details about the January offering can be found at http://www.jeremyfilliben.com/2010/12/january-ccde-practice-exams.html.  The response has been very positive, and I am looking forward to working with the candidates in a few weeks.

Even with the new responsibilities, I intend to post occasionally to this blog and keep active in various other social media universes.  Here’s a rundown of how/why I use various tools.  I thought this might be helpful to the community.  At the very least it’ll explain why I ignore most Facebook friend requests :)

Twitter – I’m on Twitter as @jfilliben.  I use this primarily for networking industry purposes.  I view Twitter as a river of information, sort of like CNN Headline News or ESPN News.  I don’t have time to watch it all day, but when I have a moment, I like to see what’s going on in the networking world.  I’m fairly good at keeping track of DMs and replies, but I miss a lot of the daily chatter.

LinkedIn – I’m on LinkedIn under the username jfilliben.  I have been a fan of LinkedIn for a long time.  In my early career I moved from job to job fairly frequently, as did many others.  LinkedIn became a way to keep in contact with my former co-workers and to keep an eye out for new opportunities.  I am open to linking with anyone who has an interest or need to contact me.  I do not see a significant downside to linking with people who I don’t know terribly well.  I suppose this can serve as a warning to anyone in my LinkedIn network: just because I’m linked to someone doesn’t mean I would vouch for them.

Facebook – I only joined Facebook because my wife told me to.  She handles the family social circle, so I don’t see a great need for me to be on it.  It was fun reconnecting with a few former classmates, but I still don’t see the point to it all.  I pretty much keep my work life and personal life separate, so I don’t normally “friend” professional associates on Facebook.  Occasionally I’ll “like” something work-related, like the Facebook LISP group.

Blogs – I write a professional one, but you already knew that.  Prior to creating this blog I used this URL in an unsuccessful campaign for a local political office, so occasionally you’ll see a reference to that in search engines.  I also started a blog for my surname, but I abandoned it a while ago and never got around to deleting it.  It hosts a single picture of the family from 2008.

I read a large number of technical blogs, and that number grows weekly.  I primarily use RSS via Google Reader.  If a blog doesn’t do RSS, I can’t remember to follow it.  And if a blog only publishes summaries to RSS, the content better be great or I’ll unsubscribe.  Yes, I’m talking about Greg, Jeremy and Ivan here.  I don’t think there are any other technical blogs that I’m willing to click-thru to read.  I would list my favorite blogs, but then I would forget someone and feelings would be hurt.  Lately I’ve been adding some storage and blade server blogs to the list.  Hopefully it is just a phase :)

I also read a few dozen non-technical blogs, mostly in the areas of sports, finance, economics and politics.  In fact, I get most of my daily news from blogs and websites.  I haven’t subscribed to a paper newspaper (ever!) and I don’t care for local TV news.

As for how I read all this content… It’s about a 50/50 split between my Blackberry and my laptop.  I own a Kindle, but I only use it to read traditional books.  I’m sure I would love an iPad, but I already spend enough time looking at digital screens.  I don’t want another temptation.  The Blackberry is less-than-ideal for reading, and I will almost never ‘click-thru’ on it because of the poor web browser, but it meets my needs most of the time.

What Else? – I think that’s about it.  I don’t understand the interest in location-based apps like Foursquare.  Why would I care where other people are at the moment?  I’m busy enough on my own; I don’t want to force myself into even more interactions when I’m at the supermarket, etc.  I’ve seen it used to good effect during professional conferences like Cisco Live, but that’s about it.

I also don’t bother with the ‘next best’ versions of the above apps.  So no Google Buzz or Plaxo for me, thanks.  I have enough to read as it is, so I have not used Digg or its equivalents.  I did not get very involved in web chatting apps either, although I have accounts on Yahoo Messenger, AIM and Google Chat (is that what they call it?).

Thursday, December 9, 2010

January CCDE Practice Exams

Last month I presented my first CCDE practice exams to the public.  The offering was a success, and two of the participants went on to take the November CCDE Practical Exam.  Both indicated that the offering was a great help for their preparation.

 

Upcoming CCDE Practice Exams

Registration is now open for these practice exams in January 2011.  The Enterprise exam will be delivered to participants on Wednesday, January 5th.  The review session is scheduled for Saturday, January 8th @ 9am EST.  The Service Provider exam will be delivered to participants on Wednesday, January 12th and the review will take place on Saturday, January 15th @ 9am EST.  One item of feedback I received from the initial offering was that it would be nice to have more time between the practice exams and the real exam.  This schedule gives candidates four weeks to digest the information and reach out to me for follow-up guidance before taking the exam on February 15th.  I was also asked to move the review sessions to the weekend to accommodate busy work schedules.  If you are interested in the offering, but the dates listed above do not work for you, let me know.  We can work to find an alternate date/time to complete the review session.

The Registration pages for this offering can be found at:

Enterprise Exam

Service Provider Exam

Combo (Both exams at a discounted cost)

Please email me with any questions you may have (jeremy@filliben.com).   As before, the first hour of the review session will include my presentation and an open discussion on the CCDE Practice Exam (format, testing experience, future developments).  The CCDE program is constantly being updated, and I do my best to keep this information up to date.  Former CCDE Practice Exam participants are invited to join this session to receive updated information.

Other News

I regret to announce that I will not be teaching the upcoming live CCDE Practical Bootcamp offered by CCBootcamp.  After reviewing my work/life balance for 2010, and my already full professional calendar for 2011, I cannot find the time to participate in this program.  I wish CCBootcamp the best of luck with their CCDE training.  I am confident that they will find a wonderful instructor to take my place.

Friday, November 19, 2010

Python Texas Hold’em Simulator

Update

I finally got around to looking at this again, and I decided to re-write the program. I also posted the new version to GitHub. You can find it at https://github.com/jfilliben/poker-sim. Feel free to disregard the below info, except perhaps to learn a lot of things you shouldn't do with Python ;)


Original

In addition to my practical Python script for Cisco switch configuration verification, I wrote a purely impractical Texas Hold’em Simulator.  I enjoy the game, and it occurred to me that a poker simulator would be a fun way to explore Monte Carlo simulations.  The following program takes as input a count of simulations to perform and two player hands, plus five community cards.  For any community card specified with “Xx”, it will randomly select a card and play out the hand.  In its simplest form, you can run it with one iteration and a fully specified set of cards, and it will tell you which hand wins:
C:>python.exe pokersim.py 1 Ac7d 8sKd 2s3s4s4c4d
Total Hands: 1
Hand1: 1 Hand2: 0 Ties: 0
Hand1: 100.0% Hand2: 0.0% Ties: 0.0%

Or you can leave some cards ‘blank’ (here the last flop card, the turn and the river) and see what randomly dealt cards will provide:
C:>python.exe pokersim.py 1000 Ac7d 8sKd 2s3sXxXxXx
Total Hands: 1000
Hand1: 612 Hand2: 385 Ties: 3
Hand1: 61.2% Hand2: 38.5% Ties: 0.3%
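For readers wondering how the “Xx” blanks could be filled: pokersim.py itself redraws random cards until the nine cards are all distinct. The sketch below shows a different approach, not the one the script uses: build the unused deck once and sample from it, so no redraw is needed. The names `split_cards` and `fill_wildcards` are my own for illustration, and the sketch assumes cards are written with an upper-case rank and lower-case suit.

```python
import random

RANKS = "23456789TJQKA"
SUITS = "cdhs"

def split_cards(text):
    """Split a string such as "Ac7d" into two-character cards."""
    return [text[i:i + 2] for i in range(0, len(text), 2)]

def fill_wildcards(community, hole_hands, rng=random):
    """Replace each "Xx" in `community` with a distinct unused card.

    Hypothetical helper (not part of pokersim.py). Assumes canonical
    case: upper-case rank, lower-case suit, "Xx" for a blank.
    """
    board = split_cards(community)
    # Every card already on the table is off limits.
    used = set(card for hand in hole_hands for card in split_cards(hand))
    used.update(card for card in board if card != "Xx")
    deck = [r + s for r in RANKS for s in SUITS if r + s not in used]
    blanks = [i for i, card in enumerate(board) if card == "Xx"]
    # random.sample never repeats an element, so the blanks stay distinct.
    for index, card in zip(blanks, rng.sample(deck, len(blanks))):
        board[index] = card
    return "".join(board)

# One random completion of the board from the example above.
print(fill_wildcards("2s3sXxXxXx", ["Ac7d", "8sKd"]))
```

Each call returns one possible board; calling it once per iteration would give the same kind of random run-out the simulator plays.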
(Note: this program uses a Monte Carlo simulation, so there is a measure of randomness to the results.  If you run the same test multiple times, you will almost always get slightly different results.  And of course, the more iterations you run, the more accurate the results.)
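To put a rough number on how accuracy improves with more iterations, here is a minimal sketch. It assumes each simulated hand is an independent trial, so the usual binomial standard error applies; the function name `standard_error` is mine and is not part of pokersim.py.

```python
import math

def standard_error(win_fraction, iterations):
    """Approximate standard error of a win fraction estimated from
    `iterations` independent simulated hands (binomial approximation)."""
    return math.sqrt(win_fraction * (1.0 - win_fraction) / iterations)

# Hand1 won 61.2% of the 1000 simulated hands in the run above.
for n in (1000, 10000, 100000):
    se = standard_error(0.612, n)
    print("%6d iterations: +/- %.2f percentage points (one standard error)"
          % (n, 100 * se))
```

At 1000 iterations the estimate is good to roughly 1.5 percentage points, and since the error shrinks with the square root of the iteration count, it takes about 100 times as many hands to gain one more decimal digit.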
Here is the script.  Again, any comments would be appreciated.  If there is a more efficient way to accomplish portions of this program, I’d love to hear about it.  And if you spot any errors, please let me know.  As a disclaimer, there is no guarantee that the results of this script are accurate.  I don’t use this for profit in any way; it was purely a thought exercise for me.
Run it with no options (or “--help”) to get a brief description of the options.
----------
pokersim.py:
#
# pokersim.py - Runs a Monte Carlo simulation of two Texas Hold'em hands
#               with user-specified (or random) community cards
#
# Work to be done:
# Add exhaustive search?
# Compare speed of copy.deepcopy and hand_copy()
# Add input checking for user input
#
import sys
import random
def hand_copy(cards):
#
# Replace copy.deepcopy with specific copy function to speed things up
# (presumably, but not tested)
#
    results = []
    for i, v in enumerate(cards):
        results.append(cards[i])
    return results
def legal_hand(cards):
#
# Returns 1 if hand is legal
# Returns 0 if hand is illegal (two of same card)
#
    for i, v in enumerate(cards):
        if cards.count(v) > 1: return 0
        elif cards.count([-1, -1]): return 0
    return 1
def valid_card(card):
#
# Returns 1 if card is a valid card in text format (rank in (A-2),
#  suit in (c, d, h, s)) or wildcard (Xx)
# Returns 0 if card is invalid (including a missing or truncated card)
#
    if len(card) != 2: return 0
    if card[0] in ("X", "x", "A", "a", "K", "k", "Q", "q", "J", "j"\
    , "T", "t", "9", "8", "7", "6", "5", "4", "3", "2"):
        if card[1] in ("x", "X", "c", "C", "d", "D", "h", "H", "s", "S"):
            return 1
    return 0
def readable_hand(cards):
#
# Returns a readable version of a set of cards
#
    string = ""
    for i, v in enumerate(cards):
        if v[0] == 0: string += "2"
        elif v[0] == 1: string += "3"
        elif v[0] == 2: string += "4"
        elif v[0] == 3: string += "5"
        elif v[0] == 4: string += "6"
        elif v[0] == 5: string += "7"
        elif v[0] == 6: string += "8"
        elif v[0] == 7: string += "9"
        elif v[0] == 8: string += "T"
        elif v[0] == 9: string += "J"
        elif v[0] == 10: string += "Q"
        elif v[0] == 11: string += "K"
        elif v[0] == 12: string += "A"
        elif v[0] == -1: string += "X"
        if v[1] == 0: string += "c"
        elif v[1] == 1: string += "d"
        elif v[1] == 2: string += "h"
        elif v[1] == 3: string += "s"
        elif v[1] == -1: string += "x"
    return string
def hand_to_numeric(cards):
#
# Converts alphanumeric hand to numeric values for easier comparisons
# Also sorts cards based on rank
#
    result = []
    for i, v in enumerate(cards):
        currentcard = [0, 0]
        if cards[i][0] == "2": currentcard[0] = 0
        elif cards[i][0] == "3": currentcard[0] = 1
        elif cards[i][0] == "4": currentcard[0] = 2
        elif cards[i][0] == "5": currentcard[0] = 3
        elif cards[i][0] == "6": currentcard[0] = 4
        elif cards[i][0] == "7": currentcard[0] = 5
        elif cards[i][0] == "8": currentcard[0] = 6
        elif cards[i][0] == "9": currentcard[0] = 7
        elif cards[i][0] in ("t","T"): currentcard[0] = 8
        elif cards[i][0] in ("j","J"): currentcard[0] = 9
        elif cards[i][0] in ("q","Q"): currentcard[0] = 10
        elif cards[i][0] in ("k","K"): currentcard[0] = 11
        elif cards[i][0] in ("a","A"): currentcard[0] = 12
        elif cards[i][0] in ("x","X"): currentcard[0] = -1
        if cards[i][1] in ("c","C"): currentcard[1] = 0
        elif cards[i][1] in ("d","D"): currentcard[1] = 1
        elif cards[i][1] in ("h","H"): currentcard[1] = 2
        elif cards[i][1] in ("s","S"): currentcard[1] = 3
        elif cards[i][1] in ("x","X"): currentcard[1] = -1
        result.append(currentcard)
    result.sort()
    result.reverse()
    return result
def check_flush(hand):
# Return 0 if not true
# Return 1 if true
#
# Initialization
#
    hand_suit = []
    hand_suit.append(hand[0][1])
    hand_suit.append(hand[1][1])
    hand_suit.append(hand[2][1])
    hand_suit.append(hand[3][1])
    hand_suit.append(hand[4][1])
    for i in range(0, 4):
        if hand_suit.count(i) == 5: return 1
    return 0
def check_straight(hand):
# Return 0 if not true
# Return 1 if true
    if hand[0][0] == (hand[1][0] + 1) == (hand[2][0] + 2) == (hand[3][0] + 3)\
    == (hand[4][0] + 4): return 1
    elif (hand[0][0] == 12) and (hand[1][0] == 3) and (hand[2][0] == 2)\
    and (hand[3][0] == 1) and (hand[4][0] == 0): return 1
    return 0
def check_straightflush(hand):
# Return 0 if not true
# Return 1 if true
    if check_flush(hand) and check_straight(hand): return 1
    return 0
def check_fourofakind(hand):
# Return 0 if not true
# Return 1 if true
# Also returns rank of four of a kind card and rank of fifth card
# (garbage value if no four of a kind)
    hand_rank = []
    hand_rank.append(hand[0][0])
    hand_rank.append(hand[1][0])
    hand_rank.append(hand[2][0])
    hand_rank.append(hand[3][0])
    hand_rank.append(hand[4][0])
    for value in range (0, 13):
        if hand_rank.count(value) == 4:
            for n in range (0, 13):
                if hand_rank.count(n) == 1: return 1, value, n
    return 0, 13, 13
def check_fullhouse(hand):
# Return 0 if not true
# Return 1 if true
# Also returns rank of three of a kind card and two of a kind card
# (garbage values if no full house)
    hand_rank = []
    hand_rank.append(hand[0][0])
    hand_rank.append(hand[1][0])
    hand_rank.append(hand[2][0])
    hand_rank.append(hand[3][0])
    hand_rank.append(hand[4][0])
    for value in range(0, 13):
        if hand_rank.count(value) == 3:
            for n in range(0, 13):
                if hand_rank.count(n) == 2: return 1, value, n
    return 0, 13, 13
def check_threeofakind(hand):
# Return 0 if not true
# Return 1 if true
# Also returns rank of three of a kind card and remaining two cards
# (garbage values if no three of a kind)
    hand_rank = []
    hand_rank.append(hand[0][0])
    hand_rank.append(hand[1][0])
    hand_rank.append(hand[2][0])
    hand_rank.append(hand[3][0])
    hand_rank.append(hand[4][0])
    for value in range(0, 13):
        if hand_rank.count(value) == 3:
            for n in range(0, 13):
                if hand_rank.count(n) == 1:
                    for m in range(n+1, 13):
                        if hand_rank.count(m) == 1: return 1, value, [m, n]
    return 0, 13, [13, 13]
def check_twopair(hand):
# Return 0 if not true
# Return 1 if true
# Also returns ranks of paired cards and remaining card
# (garbage values if no two pair)
    value = 0
    hand_rank = []
    hand_rank.append(hand[0][0])
    hand_rank.append(hand[1][0])
    hand_rank.append(hand[2][0])
    hand_rank.append(hand[3][0])
    hand_rank.append(hand[4][0])
    for value in range(0, 13):
        if hand_rank.count(value) == 2:
            for n in range(value+1, 13):
                if hand_rank.count(n) == 2:
                    for m in range(0, 13):
                        if hand_rank.count(m) == 1: return 1, [n, value], m
    return 0, [13, 13], 13
def check_onepair(hand):
# Return 0 if not true
# Return 1 if true
# Also returns ranks of paired cards and remaining three cards
# (garbage values if no pair)
    hand_rank = []
    hand_rank.append(hand[0][0])
    hand_rank.append(hand[1][0])
    hand_rank.append(hand[2][0])
    hand_rank.append(hand[3][0])
    hand_rank.append(hand[4][0])
    for value in range(0, 13):
        if hand_rank.count(value) == 2:
            for n in range (0, 13):
                if hand_rank.count(n) == 1:
                    for m in range(n+1, 13):
                        if hand_rank.count(m) == 1:
                            for o in range (m+1, 13):
                                if hand_rank.count(o) == 1: return 1, value, [o, m, n]
    return 0, 13, [13, 13, 13]
def highest_card(hand1, hand2):
# Return 0 if hand1 is higher
# Return 1 if hand2 is higher
# Return 2 if equal
#
# Initialization
#
    hand1_rank = []
    hand1_rank.append(hand1[0][0])
    hand1_rank.append(hand1[1][0])
    hand1_rank.append(hand1[2][0])
    hand1_rank.append(hand1[3][0])
    hand1_rank.append(hand1[4][0])
    hand2_rank = []
    hand2_rank.append(hand2[0][0])
    hand2_rank.append(hand2[1][0])
    hand2_rank.append(hand2[2][0])
    hand2_rank.append(hand2[3][0])
    hand2_rank.append(hand2[4][0])
#
# Compare
#
    if hand1_rank > hand2_rank: return 0
    elif hand1_rank < hand2_rank: return 1
    return 2
def highest_card_straight(hand1, hand2):
# Return 0 if hand1 is higher
# Return 1 if hand2 is higher
# Return 2 if equal
#
# Compare second card first (to account for Ace low straights)
# if equal, we could have Ace low straight, so compare first card. 
# If first card is Ace, that is the lower straight
#
    if hand1[1][0] > hand2[1][0]: return 0
    elif hand1[1][0] < hand2[1][0]: return 1
    elif hand1[0][0] > hand2[0][0]: return 1
    elif hand1[0][0] < hand2[0][0]: return 0
    return 2
def compare_hands(hand1, hand2):
#
# Compare two hands
# Return 0 if hand1 is better
# Return 1 if hand2 is better
# Return 2 if equal
#
#
# Initialization
#
    result1 = []
    result2 = []
#
# Check for straight flush
#
    if check_straightflush(hand1):
        if check_straightflush(hand2):
            return(highest_card_straight(hand1, hand2))
        else: return 0
    elif check_straightflush(hand2): return 1
#
# Check for four of a kind
#
    result1 = check_fourofakind(hand1)
    result2 = check_fourofakind(hand2)
    if result1[0] == 1:
        if result2[0] == 1:
            if result1[1] > result2[1]: return 0
            elif result1[1] < result2[1]: return 1
            elif result1[2] > result2[2]: return 0
            elif result1[2] < result2[2]: return 1
            else: return 2
        else: return 0
    elif result2[0] == 1: return 1
#
# Check for full house
#
    result1 = check_fullhouse(hand1)
    result2 = check_fullhouse(hand2)
    if result1[0] == 1:
        if result2[0] == 1:
            if result1[1] > result2[1]: return 0
            elif result1[1] < result2[1]: return 1
            elif result1[2] > result2[2]: return 0
            elif result1[2] < result2[2]: return 1
            else: return 2
        else: return 0
    elif result2[0] == 1: return 1
#
# Check for flush
#
    if check_flush(hand1):
        if check_flush(hand2):
            return(highest_card(hand1, hand2))
        else: return 0
    elif check_flush(hand2): return 1
#
# Check for straight
#
    if check_straight(hand1):
        if check_straight(hand2):
            temp = highest_card_straight(hand1, hand2)
            return temp
        else: return 0
    elif check_straight(hand2): return 1
#
# Check for three of a kind
#
    result1 = check_threeofakind(hand1)
    result2 = check_threeofakind(hand2)
    if result1[0] == 1:
        if result2[0] == 1:
            if result1[1] > result2[1]: return 0
            elif result1[1] < result2[1]: return 1
            elif result1[2] > result2[2]: return 0
            elif result1[2] < result2[2]: return 1
            else: return 2
        else: return 0
    elif result2[0] == 1: return 1
#
# Check for two pair
#
    result1 = check_twopair(hand1)
    result2 = check_twopair(hand2)
    if result1[0] == 1:
        if result2[0] == 1:
            if result1[1] > result2[1]: return 0
            elif result1[1] < result2[1]: return 1
            elif result1[2] > result2[2]: return 0
            elif result1[2] < result2[2]: return 1
            else: return 2
        else: return 0
    elif result2[0] == 1: return 1
#
# Check for one pair
#
    result1 = check_onepair(hand1)
    result2 = check_onepair(hand2)
    if result1[0] == 1:
        if result2[0] == 1:
            if result1[1] > result2[1]: return 0
            elif result1[1] < result2[1]: return 1
            elif result1[2] > result2[2]: return 0
            elif result1[2] < result2[2]: return 1
            else: return 2
        else: return 0
    elif result2[0] == 1: return 1
    return (highest_card(hand1, hand2))
def bestfive(hand, community):
#
# Takes hand and community cards in numeric form and returns best five cards
#
    currentbest = hand_copy(community)
    currentbest.sort()
    currentbest.reverse()
    m = 0
#
# Compare current best to five cards including only one player card
#
    for m in range (0, 2):
        for n in range (0, 5):
            comparehand = hand_copy(community)
            comparehand[n] = hand[m]
            comparehand.sort()
            comparehand.reverse()
            if compare_hands(currentbest, comparehand) == 1:
                currentbest = hand_copy(comparehand)
#
# Compare current best to five cards including both player cards
#
    for m in range (0, 5):
        for n in range (m+1, 5):
            comparehand = hand_copy(community)
            comparehand[m] = hand[0]
            comparehand[n] = hand[1]
            comparehand.sort()
            comparehand.reverse()
            if compare_hands(currentbest, comparehand) == 1:
                currentbest = hand_copy(comparehand)
    return currentbest
#
# Main Program Body
#
#
# Initialization
#
hand1 = []
handnum1 = []
best_hand1 = []
hand2 = []
handnum2 = []
best_hand2 = []
community = []
communitytemp = []
totals = [0,0,0]
iterations = 0
#
# Process command-line arguments
#
if (len(sys.argv) < 5) or (sys.argv[1] in ("-h", "--help")):
        sys.exit("\n\
First input is number of iterations to run the Monte Carlo simulation\n\
Input cards in format [RANK][SUIT], as in Ace Clubs + Four Diamonds = Ac4d\n\
Input should be two cards for player 1, two cards for player 2 and five community cards\n\
Wildcards should be written as Xx (capital X for rank, lower-case x for suit)\n\
Wildcards should be placed at the end of the community hand\n\n\
--help: This message\n")
else:
    iterations = int(sys.argv[1])
    if iterations < 1: iterations = 1
    if valid_card(sys.argv[2][0:2]): hand1.append(sys.argv[2][0:2])
    else: sys.exit("Player 1 Card 1 Invalid")
    if valid_card(sys.argv[2][2:4]): hand1.append(sys.argv[2][2:4])
    else: sys.exit("Player 1 Card 2 Invalid")
    if valid_card(sys.argv[3][0:2]): hand2.append(sys.argv[3][0:2])
    else: sys.exit("Player 2 Card 1 Invalid")
    if valid_card(sys.argv[3][2:4]): hand2.append(sys.argv[3][2:4])
    else: sys.exit("Player 2 Card 2 Invalid")
    if valid_card(sys.argv[4][0:2]): community.append(sys.argv[4][0:2])
    else: sys.exit("Community Card 1 Invalid")
    if valid_card(sys.argv[4][2:4]): community.append(sys.argv[4][2:4])
    else: sys.exit("Community Card 2 Invalid")
    if valid_card(sys.argv[4][4:6]): community.append(sys.argv[4][4:6])
    else: sys.exit("Community Card 3 Invalid")
    if valid_card(sys.argv[4][6:8]): community.append(sys.argv[4][6:8])
    else: sys.exit("Community Card 4 Invalid")
    if valid_card(sys.argv[4][8:10]): community.append(sys.argv[4][8:10])
    else: sys.exit("Community Card 5 Invalid")
handnum1 = hand_to_numeric(hand1)
handnum2 = hand_to_numeric(hand2)
#
#
# Monte Carlo Simulation
#
#
for n in range (0, iterations):
    communitytemp = hand_to_numeric(community)
    while not legal_hand(handnum1 + handnum2 + communitytemp):
        for i, v in enumerate(community):
            if community[i][0] in ("X", "x"):
                communitytemp[i] = [random.randrange(0,13), random.randrange(0,4)]
    best_hand1 = bestfive(handnum1, communitytemp)
    best_hand2 = bestfive(handnum2, communitytemp)
    totals[compare_hands(best_hand1, best_hand2)] += 1
# print() with a single argument works the same under Python 2 and Python 3
total = totals[0] + totals[1] + totals[2]
print("\nTotal Hands: " + str(total))
print("Hand1: " + str(totals[0]) + " Hand2: " + str(totals[1]) + " Ties: " + str(totals[2]))
print("Hand1: " + str(round(100.0 * totals[0] / total, 2))
+ "% Hand2: " + str(round(100.0 * totals[1] / total, 2))
+ "% Ties: " + str(round(100.0 * totals[2] / total, 2)) + "%")
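On the question of a more compact way to do parts of this: the sketch below is one possible alternative to the separate check_* functions, not a drop-in replacement. It assumes the same numeric [rank, suit] encoding that hand_to_numeric produces (rank 0 is a deuce, 12 is an ace) and reduces each five-card hand to a tuple that Python can compare directly. The names `hand_key` and `best_five` are my own.

```python
from collections import Counter
from itertools import combinations

def hand_key(hand):
    """Return a tuple that sorts five-card hands by strength (higher wins).

    `hand` is five [rank, suit] pairs in the script's numeric encoding.
    """
    ranks = sorted((card[0] for card in hand), reverse=True)
    counts = Counter(ranks)
    # Ranks ordered by multiplicity first, then by rank (trips before kickers).
    ordered = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len(set(card[1] for card in hand)) == 1
    straight = len(counts) == 5 and ranks[0] - ranks[4] == 4
    if ranks == [12, 3, 2, 1, 0]:
        # Ace-low straight: the ace plays below the deuce.
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, ordered)

def best_five(hole, community):
    """Pick the strongest five of the seven available cards."""
    return max(combinations(hole + community, 5), key=hand_key)

# First example from the post: Ac7d vs 8sKd on a 2s3s4s4c4d board.
board = [[0, 3], [1, 3], [2, 3], [2, 0], [2, 1]]
hand1 = best_five([[12, 0], [5, 1]], board)
hand2 = best_five([[11, 1], [6, 3]], board)
print(hand_key(hand1) > hand_key(hand2))
```

Both players hold three fours here, so the comparison comes down to the ace kicker, which agrees with the first example run in the post. Trying all 21 five-card subsets is brute force, but it keeps the selection logic to a single line.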

Switchport Verification Script

A commenter on my previous post directed me to the module CiscoConfParse, which would likely have made this a much easier exercise.  Even so, this was a great exercise for me to go through, as it allowed me to learn a new programming language.
I’ll still post my solution to this problem, along with the supporting files.  The first file below is the Python script.  Running it with the -h option will show you the proper format for the parameters.  The supporting file “verification.txt” is also listed below.  It describes the format to use for the verification seed file.
If you happen to see anything wrong with this script, or room for improvement, please share your thoughts via comment or email.  I am always looking to learn, and I know there is room for improvement in my Python programming!
Jeremy

--------------------------
configcheck.py:
#
# configcheck.py - A script that verifies the existence or
#   absence of specific configuration lines in a Cisco switch config
#
import os
import re
import sys
import getopt
#
# Checks router interface configuration against a base config
#
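def get_config_snmp(target):
    """SNMP retrieval is not implemented in this version"""
    sys.exit("ERROR: SNMP retrieval is not implemented in this version")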
def get_config_file(filename):
    """getconfig loads a router config and returns an array with individual lines"""
    if os.path.isfile(filename) == 0:
        sys.exit("ERROR: File does not exist")
    contents = []
    with open(filename, 'r') as file:
        for a in file: contents.append(a)
#
# Clear '\r', '\n', ' ' characters from file (to allow interchangeability between Unix/Windows)
#
    for x in range(0, len(contents)):
        if len(contents[x]) > 1:
            while contents[x][-1:] in ('\r', '\n', ' '): contents[x] = contents[x][:-1]
    return contents
def gethostname(config):
    """gethostname retrieves the device hostname from configuration file"""
    for line in config:
        # Match "hostname <name>" and return everything after the keyword
        if line[:9] == "hostname ":
            return line[9:]
    return "No Hostname Found"
def check_interfaces(config, int_type, verify_config):
    """check_interfaces retrieves interface configuration\
from the config and sends it to verify"""
    n = -1
    while n <= (len(config) - 2):
        n += 1
        if len(config[n]) > 9:
            if config[n][:9] == "interface":
                interface = []
                interface.append(config[n])
                n += 1
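                # Stop at the end of the file as well as at the first unindented line
                while n < len(config) and config[n][:1] == ' ':
                    interface.append(config[n])
                    n += 1
                # interface[1] is expected to be the description line
                if len(interface) > 1 and re.search(int_type, interface[1]):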
                    verify(interface, verify_config)
def verify(interface, verify_config):
    """verify takes an interface config and a list of required statements\
and verifies that each is present"""
    s = "\n\n! The Following Interface is Missing the Command(s) Below: \n\n"\
    + interface[0] + '\n' + interface[1] + '\n'
    error = 0
    for a in verify_config:
        if a[1:2] == "!":
           # Forbidden command: drop the "!" marker and look for the command itself
           b = " " + a[2:]
           if interface.count(b):
               error += 1
               # Report the command that would remove the forbidden line
               if b[1:4] == "no ":
                   s += " " + b[4:] + '\n'
               else:
                   s += " no" + b + '\n'
        elif not interface.count(a):
            error += 1
            s += a + '\n'
#Print Error Interfaces
    if error: print(s)
#
# Main Program Body
#
#
# Process command-line arguments
#
config = []
verify_file = "verification.txt"
try:
    opts, args = getopt.getopt(sys.argv[1:], "hs:f:v:", ["help", "snmp=", "file=", "verify="])
except getopt.GetoptError as err:
    # print help information and exit:
    print(str(err)) # will print something like "option -a not recognized"
    sys.exit(2)
for o, a in opts:
    if o in ("-h", "--help"):
        sys.exit("\nOptions:\n\
-s, --snmp: Use SNMP to retrieve configuration (not implemented in this version)\n\
-f, --file: Load configuration from local file\n\
-v, --verify: Load verification file from local file (default is verification.txt)\n\
-h, --help: This message")
    elif o in ("-v", "--verify"):
        verify_file = a
    elif o in ("-s", "--snmp"):
        config = get_config_snmp(a)
        break
    elif o in ("-f", "--file"):
        config = get_config_file(a)
        break
    else:
        assert False, "unhandled option"
print("\n\n\nHostname (from configuration file): " + str(gethostname(config)))
#
# Cycle through verification.txt to retrieve individual verification configuration
#
contents = []
int_type = None
with open(verify_file, 'r') as file:
    for a in file:
        if len(a) >= 4:
            if a[0] == '[':
                int_type = a[1:5]
            elif a[0] == ' ':
#
# Clear '\r', '\n', ' ' characters from file (to allow interchangeability between Unix/Windows)
#
                while a[-1:] in ('\r', '\n', ' '): a = a[:-1]
                contents.append(a)
            elif a[0] == "#":
                pass
        elif a[0] == "#":
            pass
        elif contents:
            # A blank line ends the current interface type
            check_interfaces(config, int_type, contents)
            contents = []
# Check the last interface type if the file does not end with a blank line
if contents:
    check_interfaces(config, int_type, contents)
------------------------------------------------------
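The interface-walking step in check_interfaces above can also be written as a single pass that groups sub-commands under their interface header. This is only a sketch under one assumption: sub-commands are indented by at least one space, as they are in a real running-config. The function name `interface_blocks` is my own and does not appear in configcheck.py.

```python
def interface_blocks(config_lines):
    """Group config lines into {interface header: [indented sub-commands]}.

    Hypothetical helper; assumes sub-commands start with a space.
    """
    blocks = {}
    current = None
    for line in config_lines:
        line = line.rstrip()
        if line.startswith("interface "):
            current = line
            blocks[current] = []
        elif line.startswith(" ") and current is not None:
            blocks[current].append(line)
        else:
            # "!" or any unindented global command ends the block.
            current = None
    return blocks

sample = [
    "hostname SampleSwitch",
    "!",
    "interface GigabitEthernet1/2",
    " description ENDU; User A - Properly Configured",
    " switchport mode access",
    "!",
]
for header, commands in interface_blocks(sample).items():
    print(header, len(commands))
```

With the blocks in a dictionary, the description check and the per-command verification become simple lookups instead of index arithmetic.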
verification.txt:
#
# Comments begin with "#"
# Commands that must not be in the switchport configuration should begin with " !"
# Each interface type is headed with [XXXX]
# There is to be a blank line between each type
#
[ENDU]
 switchport access vlan 10
 switchport mode access
 !mls qos trust dscp
 service-policy input END-USER-IN
 ip dhcp snooping limit rate 100
 spanning-tree portfast
 spanning-tree bpduguard enable
 no logging event link-status
 no snmp trap link-status
 load-interval 30
 switchport voice vlan 12

[SRVR]
 !mls qos trust dscp
 service-policy input SERVER-IN
 ip dhcp snooping limit rate 100
 spanning-tree portfast
 spanning-tree bpduguard enable
 logging event link-status
 no snmp trap link-status
 load-interval 30

[MPBX]
 switchport mode access
 mls qos trust dscp
 flowcontrol receive desired
 flowcontrol send off
 ip dhcp snooping limit rate 100
 logging event link-status
 snmp trap link-status
 load-interval 30

-------------
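The verification file format described in the comments above can be read into a small dictionary in one pass. This is a sketch rather than the way configcheck.py does it; the function name `parse_verification` and the "required"/"forbidden" keys are my own choices for illustration.

```python
def parse_verification(lines):
    """Parse verification-file lines into
    {type: {"required": [...], "forbidden": [...]}}.

    Hypothetical helper. Assumes "#" comments, one [XXXX] header per
    interface type, and "!" marking commands that must be absent.
    """
    rules = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            rules[current] = {"required": [], "forbidden": []}
        elif current is not None:
            if line.startswith("!"):
                rules[current]["forbidden"].append(line[1:].strip())
            else:
                rules[current]["required"].append(line)
    return rules

sample = """\
# Comments begin with "#"
[ENDU]
 switchport mode access
 !mls qos trust dscp

[MPBX]
 mls qos trust dscp
""".splitlines()

for int_type, rule in parse_verification(sample).items():
    print(int_type, rule)
```

Because this version strips whitespace before testing each line, it does not depend on the leading space or on blank lines between types, which makes the seed file a little more forgiving to edit by hand.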
And to be complete, here is a sample switch configuration to use
sampleswitch.log:
!
hostname SampleSwitch
!
interface GigabitEthernet1/2
 description ENDU; User A - Properly Configured
 switchport
 switchport access vlan 10
 switchport mode access
 switchport voice vlan 12
 no logging event link-status
 load-interval 30
 wrr-queue bandwidth 5 25 70
 wrr-queue queue-limit 5 25 40
 wrr-queue random-detect min-threshold 1 80 100 100 100 100 100 100 100
 wrr-queue random-detect min-threshold 2 80 100 100 100 100 100 100 100
 wrr-queue random-detect min-threshold 3 50 60 70 80 90 100 100 100
 wrr-queue random-detect max-threshold 1 100 100 100 100 100 100 100 100
 wrr-queue random-detect max-threshold 2 100 100 100 100 100 100 100 100
 wrr-queue random-detect max-threshold 3 60 70 80 90 100 100 100 100
 wrr-queue cos-map 1 1 1
 wrr-queue cos-map 2 1 0
 wrr-queue cos-map 3 1 4
 wrr-queue cos-map 3 2 2
 wrr-queue cos-map 3 3 3
 wrr-queue cos-map 3 4 6
 wrr-queue cos-map 3 5 7
 mls qos trust dscp
 flowcontrol receive desired
 flowcontrol send off
 spanning-tree portfast
 spanning-tree bpduguard enable
 service-policy input END-USER-IN
 ip dhcp snooping limit rate 100
 no snmp trap link-status
!
interface GigabitEthernet1/3
description SRVR; Server A - Properly Configured
switchport
switchport access vlan 15
switchport mode access
logging event link-status
load-interval 30
wrr-queue bandwidth 5 25 70
wrr-queue queue-limit 5 25 40
wrr-queue random-detect min-threshold 1 80 100 100 100 100 100 100 100
wrr-queue random-detect min-threshold 2 80 100 100 100 100 100 100 100
wrr-queue random-detect min-threshold 3 50 60 70 80 90 100 100 100
wrr-queue random-detect max-threshold 1 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 2 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 3 60 70 80 90 100 100 100 100
wrr-queue cos-map 1 1 1
wrr-queue cos-map 2 1 0
wrr-queue cos-map 3 1 4
wrr-queue cos-map 3 2 2
wrr-queue cos-map 3 3 3
wrr-queue cos-map 3 4 6
wrr-queue cos-map 3 5 7
mls qos trust dscp
flowcontrol receive desired
flowcontrol send off
spanning-tree portfast
spanning-tree bpduguard enable
service-policy input SERVER-IN
ip dhcp snooping limit rate 100
no snmp trap link-status
!
interface GigabitEthernet1/4
description SRVR; Purposely Misconfigured
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
logging event link-status
load-interval 30
wrr-queue bandwidth 5 25 70
wrr-queue queue-limit 5 25 40
wrr-queue random-detect min-threshold 1 80 100 100 100 100 100 100 100
wrr-queue random-detect min-threshold 2 80 100 100 100 100 100 100 100
wrr-queue random-detect min-threshold 3 50 60 70 80 90 100 100 100
wrr-queue random-detect max-threshold 1 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 2 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 3 60 70 80 90 100 100 100 100
wrr-queue cos-map 1 1 1
wrr-queue cos-map 2 1 0
wrr-queue cos-map 3 1 4
wrr-queue cos-map 3 2 2
wrr-queue cos-map 3 3 3
wrr-queue cos-map 3 4 6
wrr-queue cos-map 3 5 7
mls qos trust dscp
flowcontrol receive desired
flowcontrol send off
spanning-tree portfast
spanning-tree bpduguard enable
ip dhcp snooping trust
!
interface GigabitEthernet1/5
description ENDU; Purposely Misconfigured
switchport
switchport access vlan 30
switchport mode access
switchport voice vlan 125
no logging event link-status
load-interval 30
wrr-queue bandwidth 5 25 70
wrr-queue queue-limit 5 25 40
wrr-queue random-detect min-threshold 1 80 100 100 100 100 100 100 100
wrr-queue random-detect min-threshold 2 80 100 100 100 100 100 100 100
wrr-queue random-detect min-threshold 3 50 60 70 80 90 100 100 100
wrr-queue random-detect max-threshold 1 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 2 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 3 60 70 80 90 100 100 100 100
wrr-queue cos-map 1 1 1
wrr-queue cos-map 2 1 0
wrr-queue cos-map 3 1 4
wrr-queue cos-map 3 2 2
wrr-queue cos-map 3 3 3
wrr-queue cos-map 3 4 6
wrr-queue cos-map 3 5 7
snmp trap mac-notification change added
mls qos trust dscp
flowcontrol receive desired
flowcontrol send off
spanning-tree portfast
spanning-tree bpduguard enable
service-policy input END-USER-IN
ip dhcp snooping limit rate 100

Tuesday, November 16, 2010

Python Programming and the Network Engineer

I’ve recently rediscovered my interest in computer programming.  As I mentioned before, I once was an avid programmer.  In my youth I hacked around in C++ and x86 assembler, making interesting little programs that didn’t accomplish much, but which gave me a great sense of accomplishment.  In high school, I was a member of the third place team in a national computer programming competition.  One of my teammates is now a software engineer at Google, so maybe I wasn’t exactly the star of the show, but my contributions seemed important at the time.  :)
My college years went a long way toward suppressing any interest I had in programming computers.  The rote, repetitive routine of programming compilers, operating systems, database engines, etc., drove me directly into a career of infrastructure work.  I have no professional regrets, but I do sort of miss the thrill of writing working code.  There is a great sense of satisfaction derived from formulating an idea and seeing it through from a blank Notepad++ document to a working, debugged application.
A couple of months ago I was faced with a project that resurrected my interest in programming.  My team and I have been working steadily toward standardizing our network deployments.  Standardization is a key requirement for effectively managing a large network with a small number of engineers.  Prior to this effort, such things as VLAN numbering and port-level configurations varied between our locations, even when two offices were roughly similar in size.  One of the causes of this was the melding of several different IT organizations when my employer was created.  We have also acquired several smaller companies since our genesis in 2004.
To accomplish our switchport standardization, we use the description field to identify the type of device that is connected to a specific switchport.  If a PBX is connected to a port, the description field would be something like:
interface Gigabit1/1
description PBX; Name_of_PBX

and if the port is connected to an end user, we would use:
interface Gigabit1/2
description USR; John Doe – patch-panel 1A

The text after the semi-colon is free-form, and can be defined on a site-by-site basis to be whatever is relevant to the local support team.
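As a rough illustration of this convention, a description line can be split into its type code and free-form text with a few lines of Python.  This is only a sketch: the function name parse_description is made up for this example, and it assumes the type code is everything before the first semi-colon.

```python
def parse_description(line):
    """Split a 'description TYPE; free-form text' line into (type, text).

    Returns None if the line is not a description in the expected form.
    """
    line = line.strip()
    if not line.startswith("description "):
        return None
    body = line[len("description "):]
    if ";" not in body:
        return None
    # partition() splits on the first semi-colon only, so the free-form
    # text may itself contain semi-colons.
    port_type, _, free_text = body.partition(";")
    return port_type.strip(), free_text.strip()


print(parse_description("description PBX; Name_of_PBX"))
print(parse_description("description USR; John Doe - patch-panel 1A"))
```

Lines that do not follow the convention return None, which gives an audit a simple way to flag ports that were never labeled.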
PBX ports require a different set of configuration statements than USR ports.  For one example, we need a “DSCP trust” statement on the PBX port, while we remark USR traffic with a specific QoS policy using the “service-policy input USR-POLICY” command.
Our standardization effort is now largely complete.  Our latest challenge is making sure our configurations stay standardized.  VLAN identifiers are easy, since it is quite painful to change them.  Switchport configurations are a very different story.  It is quite easy for a switchport to be configured incorrectly but still work in a suboptimal fashion.  For example, if the wrong QoS marking policy is installed, basic testing will work (ping, etc), but under load there will likely be performance issues.  Even if the proper template is used to activate a port, it can be difficult to prevent that port from being reused for another purpose.
My solution to this issue was to implement an audit process that would take a baseline configuration and compare it to our production switch configurations.  I first reached out to our primary network tools provider, Solarwinds.  Their current product provides some of the features I need, such as verifying that specific lines exist in the configuration.  But they cannot yet do context-sensitive configuration checking.  Other vendors (such as Netcordia) could perform this task, but I had recently championed a network tools consolidation project, so it would be a bit hypocritical of me to request funding for a new tool.  I decided to learn a scripting language to try to tackle this need.  A friend suggested that Python or Perl would fit the need nicely, so I chose Python (for no particularly good reason) and began learning.
I was quite surprised at how easily I learned the language and was able to solve my problem.  I spent a total of eight hours from the time I settled on Python until I had a working prototype.  The small investment of time is primarily due to the ease of use of the Python language, and not my programming acumen.
The entire script is a few hundred lines, including comments, and it meets all of my needs.  I wouldn’t say it is terribly user friendly, and I have a laundry list of additional features and ‘opportunities for improvement’, but I am happy with the result.  If I can find the time (and there is interest), I’ll clean it up and publish it in a separate blog post.  It is generic enough that other organizations can probably use it without a lot of rewriting.
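To make "context-sensitive" concrete, here is a minimal sketch of the idea in Python.  It is not the actual audit script.  The function names, the RULES table and the sample configuration are all invented for illustration, and it assumes each interface block ends with a bare "!" line and that "mls qos trust dscp" is the DSCP trust statement in question.

```python
def split_interfaces(config_text):
    """Return {interface_name: [config lines]} from a Cisco-style config."""
    interfaces = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("interface "):
            current = line[len("interface "):]
            interfaces[current] = []
        elif line == "!" or not line:
            current = None  # a bare "!" (or blank line) ends the block
        elif current is not None:
            interfaces[current].append(line)
    return interfaces


def audit_interface(lines, rules):
    """Check one interface against the rules for its port type.

    rules maps a type code to {"required": [...], "forbidden": [...]}.
    Returns a list of problems (empty if the port is compliant).
    """
    port_type = None
    for line in lines:
        if line.startswith("description ") and ";" in line:
            port_type = line[len("description "):].split(";", 1)[0].strip()
            break
    if port_type not in rules:
        return ["no known port type in description"]
    problems = []
    for cmd in rules[port_type]["required"]:
        if cmd not in lines:
            problems.append("missing: " + cmd)
    for cmd in rules[port_type]["forbidden"]:
        if cmd in lines:
            problems.append("must not be present: " + cmd)
    return problems


# Hypothetical baseline, keyed by the type code in the description field.
RULES = {
    "PBX": {"required": ["mls qos trust dscp"], "forbidden": []},
    "USR": {"required": ["service-policy input USR-POLICY"],
            "forbidden": ["mls qos trust dscp"]},
}

SAMPLE = """\
interface Gigabit1/1
 description PBX; Name_of_PBX
 mls qos trust dscp
!
interface Gigabit1/2
 description USR; John Doe
 mls qos trust dscp
!
"""

for name, lines in split_interfaces(SAMPLE).items():
    print(name, audit_interface(lines, RULES) or "OK")
```

The point of the sketch is that the same line ("mls qos trust dscp") is correct on one port and an error on another, which a simple "does this line exist anywhere in the configuration" check cannot express.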

Monday, October 25, 2010

CCDE Practice Exam Offering

I am breaking from my normal technology writing to unveil a new CCDE Practical practice exam opportunity.

What Is It?

This CCDE practice exam offering is intended to replicate the style and difficulty of an individual scenario presented during the CCDE Practical Exam (352-011).  Each scenario begins with a multi-page overview document, followed by up to 25 technical questions and additional documents.  The questions are in the style of actual CCDE Practical exam questions, similar to those found in the CCDE Practical Demo on Cisco Learning Network.  I am limiting this offering to a small number of candidates for this initial attempt.

What Is It Not?

This is not an actual graded exam.  It did not come from Cisco, and was not used during a real CCDE exam.  This exam was also not built using the Adobe Flash engine.  It is delivered via PDF.  The question styles and the technical difficulty of the questions closely follow the actual exam, but none of these questions will be found on an actual exam.  In other words, no NDAs were violated during the construction of these exams.

What Will You Receive?

Two days before the scheduled Webex session you will receive an email with two attachments.  The first is an overview of the CCDE Practical Exam.  It describes the exam environment, the structure of the exam and the types of questions you will receive during the exam.  You will also receive guidance on how to complete the practice exam.  For example, this exam is intended to be closed book.

The second attachment is the practice exam scenario, in PDF format.  At the current time, there are four different practice exams available, each consisting of approximately twenty questions.  Two of the exams are Enterprise-based, and two are focused on a Service Provider network.  All CCDE candidates registering for a specific practice exam session will receive the same exam.  The exam is intended to take 60 to 90 minutes to complete. 

The last item you will receive is a login id/password for the three-hour exam review session.  The date and time of the session are clearly listed on the Eventbrite registration page.  Due to my schedule, I will be unable to offer multiple review sessions for the same exam.  Please be certain to clear your schedule for the duration of the review session.  If an illness or unforeseen emergency prevents an individual from attending the review session we will make an effort to schedule a second session, but I cannot guarantee my availability.  During the review session we will work our way through the exam and I will show you where to find the information to correctly answer each question.  This review session is intended to be highly interactive.  Candidate questions are encouraged and expected… I want you to challenge my answers and assumptions.

Pricing and Registration

A single practice exam is priced at $695 (US Currency).  A pair of practice exams is priced at $995.  Due to limited timing before the next CCDE Practical offering (scheduled for Friday, November 12th), I will only be able to offer two practice exams at this time.  To appeal to the widest audience, I have chosen one Enterprise exam and one Service Provider-based exam for this offering.  The Enterprise exam will be distributed to registered candidates on Tuesday, November 2nd and reviewed on Thursday, November 4th from 9am to noon (Eastern US Time Zone).  The Service Provider exam will be distributed to registered candidates on Friday, November 5th and reviewed on Tuesday, November 9th from 9am to noon (Eastern US Time Zone).

The Practice Exam Registration pages can be found at the following links:

Single Enterprise Exam

Single Service Provider Exam

Both Practice CCDE Exams

Why Am I Doing This?

The number one complaint I’ve heard about the CCDE program is the lack of practice opportunities for the practical exam.  Cisco did a great service to the candidates by publishing a sample exam on Cisco Learning Network.  Without it I would have struggled greatly with the format of the exam. But Cisco’s demo exam does not even begin to represent the amount of reading or the depth of the technical questions on the actual test.  My practice exam offering replicates the actual exam’s technical depth and ambiguity.  Those candidates who have already sat for the practical exam know what I mean by this!  :)

I have taught several week-long CCDE Practical courses for a well-known Cisco Learning Solutions Partner.  They have been very rewarding experiences for me, and based on the feedback, for the students too.  While I fully intend to continue teaching that course, I also want to offer a resource for candidates who cannot take a week-long break from their normal lives to prepare for this certification program.  If you have recently taken the full week training offering, you are already familiar with the two exams included in this specific offering.  They have been updated based on the students’ feedback and comments.

Additional Information

I am limiting these events to a small number of participants.  It isn’t clear to me how many candidates I can include in a single WebEx session while maintaining the interactive nature of the event.  I may increase the enrollment for future offerings, depending on how well this initial attempt goes.

If you have any questions about this opportunity, please email me at jeremy@filliben.com, or post a comment to this post.

Friday, October 15, 2010

A Comparison of Current Spanning-Tree Elimination Strategies

As I mentioned in the last post, I attended the Net Tech Field Day event hosted by Gestalt IT in September. My focus in attending was on Data Center switching technologies. Of particular interest to me were the methods by which each vendor is attempting to eliminate spanning-tree from the data center. While I have been keeping my eye on TRILL and 802.1aq, I am more interested in how vendors are solving this issue today.
All of the current solutions can be described as Multi-Chassis Link Aggregation (MLAG) methods. Cisco has three solutions available for this purpose. The 3750 and 2975 switches perform chassis aggregation via proprietary stacking cables. This stacking feature allows a network engineer to create a single switch out of multiple physical devices. All devices in a stack are managed via a single control plane. Cisco’s 6500 series switches have a similar feature, called Virtual Switching System or VSS, which uses standard 10gb interfaces to achieve the same result. At the current time, VSS is limited to aggregating two chassis, but Cisco’s goal is to extend this to more devices. On the Nexus 7k and 5k platforms, the virtual Port-Channel (vPC) feature allows two physical devices to be logically paired together to present a common switching platform to connected devices. The important difference between vPC and the Stacking/VSS methods is that the control planes of the vPC devices are separate.
Juniper and HP both described their visions of a single control plane for the data center. Juniper went into great detail about their stacking technology (called Virtual Chassis) for fixed-configuration switches, as well as their standard Ethernet-based method for interconnecting modular switches. HP was less technical in their presentation. By my best guess, they have a VSS-style Ethernet interconnection method.
Force10’s VirtualScale technology combines the control planes of two or more switches to offer MLAG. The connections between the switches are standard 1 or 10gb links.
According to Arista Networks, their MLAG solution can pair two switches into a single logical switch. It isn’t clear from the documentation whether this feature combines the two control planes or keeps them separate. The configuration documentation is behind a paywall :(
Here’s a table of the vendors and where their solutions reside:
Proprietary Stacking 1/10gb Stacking Separated Control Planes
Arista Networks X
Cisco Systems X X X
Force10 X
Hewlett-Packard ? X (I think)
Juniper X (I think) X

My Thoughts

I am not yet comfortable with combining control planes in a data center environment. I much prefer Cisco’s vPC method of spanning-tree elimination over the stacking and VSS methods. There are several factors that contribute to this point of view. First, I was bitten by a VSS bug about 18 months ago. I suppose I should chalk that up to being an early adopter, but I guess I hold a grudge :)
Second, the shared-fate aspect of a single control plane makes me uncomfortable. When I strive to eliminate single points of failure in the data center, I look for the following items:
  1. Single-Attached Servers – If a server owner chooses to take this risk, I am not responsible for the impact of a switch or cable failure.
  2. Port-Channel Diversity – I work to ensure that single-device to single-device port-channels are built using separate modules on chassis-based switches. I also attempt to diversify the cable paths. For example, I’ll run one cable of a port-channel up the left side of a rack, and the diverse cable up the right side. If the opportunity presents itself, I’ll utilize a mix of copper and fiber in a single port-channel for an extra level of comfort, although I’ll admit that this is excessive in typical Data Centers.
  3. Power Diversity for Paired Switches – When two switches are configured as a pair (for example, when individual servers are connected to both switches), I ensure that they are powered by different PDUs, or are at least on different UPSs. If separate UPSs are unavailable, it is preferable not to have the second switch on a UPS at all. To look at it another way, I’d rather have a single switch up for 30 minutes, versus a pair of switches up for 15 minutes. While I haven’t implemented this idea in my data centers, I am intrigued by it as a method for reducing the load on our Data Center UPSs. (The same goes for servers performing duplicate functions, if sysadmins are still reading this).
  4. Control-Plane Diversity – If a single reload command can take down my entire data center (even momentarily), I don’t quite have diversity. I’ve heard the “Operator error is the cause of most IT downtime” mantra often enough for it to have sunk in, at least a bit. If the reload command doesn’t concern you, think about how a simple configuration error would no longer be isolated to a single switch.
I’ll stop the list here, but there are probably many others I haven’t listed. Feel free to mention your favorites in the comments and I’ll add them here with appropriate credit.

Tuesday, September 28, 2010

Ten Gigabit Switching Thoughts

As mentioned in a previous post, I recently attended the Gestalt IT-organized Net Tech Field Day in San Jose, CA. This event brought me back in contact with a former colleague (Terry Slattery) and a number of my podcast/blogging friends (Ethan Banks, Greg Ferro, Brandon Carroll and Ivan Pepelnjak). In addition, I met in person for the first time many of the network-related blog authors and tweeters I follow (Jeremy Gaddis, , Josh Horton, Jennifer Huber, Steve Rossen and Jeremy Stretch). Oh, and Bob Plankers was there too, but he is just a server guy ;)

Thank you all for your company and your contributions to the many technical and non-technical discussions. And a special thank you to Stephen Foskett and Claire Chaplais for organizing this event. It was an amazing feat of logistics and vendor management. I am in awe of how smoothly the event went. I often run into more trouble during my daily commute, and I work from home!

During the planning of this event, Stephen Foskett asked the attendees what they were most interested in hearing about. The plurality of the responses, including my own, mentioned Data Center technologies. The vendors did not disappoint, as no fewer than five of the seven participants focused on this area. We received briefings on data center switching technologies from Hewlett-Packard, Force10, Juniper and Arista Networks.

The goal of this post is to compare/contrast the 10gb switch offerings of these vendors. Also, because the presentations/discussions made it clear that these vendors measure themselves against Cisco Systems in both market share and feature parity, I’ll include Cisco as well. I regret not including Foundry, Extreme, Oracle/Sun and anyone else, but I do not have any firsthand knowledge of their offerings. Any mention I would make of them would be strictly web-based research. You can do that yourself. :)

(Note… I decided to delay my review of the vendors’ chassis aggregation technologies for another blog post. This one was getting too long even without it.)

Fixed Configuration 10gb Switches

In short, they all have them. Arista sort of “ups the ante” in terms of advertised performance with their 7148SX switch. It is advertised as a low-latency device suitable for High Performance Computing (HPC) and High Frequency Trading (HFT) needs. Force10 also competes in this space with their S2410 device, which is promoted as a component of the New York Stock Exchange’s network. Our Arista contact made the point of saying that their switch is not eligible to be deployed at the NYSE because their company is not listed on that exchange. Abner Germanow (@abnerg) of Juniper mentioned that their devices were used in stock exchanges as well. Most of the supplied documentation mentions Juniper's M-Series routing platforms. At least one link (http://bloga.tw/a648bU) mentions the use of Juniper's EX-series LAN switches. I also noticed the Juniper routers included in the 60 Minutes feature on HFT. Cisco and HP do not appear to be competitive in the HPC/HFT arena, although I may have missed something in my research.

Another important item to note is that each of the presenting switch vendors at Net Tech Field Day has a 10gb, fixed configuration Layer-3 switch in their portfolio. Cisco (who did not present at the event) does not yet have this available. In a meeting with Jim Capobianco of Cisco last week, I learned that the upcoming Nexus 5548 & 5596 switches will eventually have this capability. It will require the installation of a Layer-3 Forwarding Engine, and will not be available until Q1 CY2011. I am surprised they’re taking so long to deliver this, as it must be the cause of lost sales opportunities. The 4900M has something to offer in this space, but it is clearly not an integrated part of the new Nexus DC approach.

Lack of Innovation in the Space

I was struck by the similarities of all the switching vendors. Perhaps it was the tight timing constraints of the Net Tech Field Day sessions (most were 2 – 4 hours, with hard stops at the end), but with the exception of Arista Networks’ offering (which I’ll discuss below), all of the vendors had very similar stories. This has been noted by several of my fellow attendees (most notably Ivan). I don’t mean to pile on the criticism, as I’m sure developing these products is very difficult in itself, but I would love to see a significant differentiator from each vendor.

Juniper’s One OS

Aside from Arista, Juniper probably did the best job of differentiation with their “One OS” discussion. Their claim is that having a unified OS across multiple switching and routing platforms reduces OPEX. Support for this claim includes:

  1. Network engineers only need to learn a single CLI
  2. Commands are common to all platforms, allowing for better configuration standardization
  3. Feature parity across all devices

Counterpoints include:

  1. Not everything is standardized in JUNOS, such as hardware-based QoS configuration
  2. Feature-set differences negate part of the feature parity claim (for example, no MPLS on LAN switches)
  3. Cisco’s assertion that purpose-built OSs are better suited for unique environments

I’m not yet sure how I feel about this one. In all honesty, I’ve not had significant trouble learning new CLIs. The feature parity argument carries a bit more weight with me, especially considering my challenges with implementing features across IOS and NX-OS. I am also sympathetic to the OS-sprawl argument, best described by Mike Morris on his Network World blog. I suppose I’ll let the industry sort this one out without my input. I’m sure they’ll manage :)

Arista… Finally Something New

I was clued into Arista Networks about a year ago, when I read that Jayshree Ullal (and later Doug Gourlay) jumped ship from Cisco to join the 10gb switching startup. An industry friend of mine also highlighted their offering to me a few months later. At some point, I got added to a sales list for the company, and for the last few months I’ve received occasional marketing materials via email (thanks Alicia!). For me, the buildup to Arista Networks’ presentation was quite extensive. I did not want this to influence my fellow attendees’ perceptions, so I kept my thoughts to myself during the event.

Doug Gourlay of Arista scored a lot of points with the audience when he quickly explained that Arista Networks builds Data Center switches, and nothing else. It is clear that Arista is not attempting to be all things to all people. Their switches’ TCAM has room for 16K routes, 16K MAC addresses and 16K ARP entries. They are not going to be able to hold the Internet routing table, nor is Arista attempting to sell products that could do that. Doug was quite blunt when he said that “Arista is selling to companies where IT makes money.” Such businesses include Wall Street firms, HPC opportunities (Bio-Tech and other sciences) and social media websites.

So what’s new? For one thing, the switch runs a nearly-standard version of Linux, Fedora Core 12, kernel 2.6.31 (thank you to Doug for the correction). According to the company, only about 750 lines were changed in the kernel to support the movement of device interrupts from system space to user space. This facilitates the starting and restarting of device drivers, and protects those processes from affecting the stability of the overall system. End users can build FreeBSD-compatible programs and run them in user space within the OS. EOS, Arista’s switch operating system, normally only requires 10% of one CPU core. On their dual-core switches, this leaves 95% of the processor power to custom-written applications. We were assured that EOS receives priority, so it is unlikely that a user application would affect the stability of the switching function. This capability is a standard feature of the Arista platform, unlike the additional cost of Cisco’s NM-based machine.

A second compelling EOS feature is VM Tracer. This allows a network admin to determine what device is attached to a particular port. If it is an ESX/ESXi server, it can query the server using VMware’s API to determine which VMs are running on it. If a VM is VMotion’d to another ESX host, the switch can detect this and move the port-profile to the new location. It would be interesting to see exactly how this feature stacks up against Cisco’s NX-OS capabilities.

Summary

There are plenty of options available for 10gb data center network builds. Cisco is likely the safe option, although I do not see that they have any compelling features that would preclude me from choosing another vendor. Long ago I learned that one of the best negotiation tactics is to find two (or more) solutions you would be happy to deploy, then let both vendors know it. This will often get you the lowest price for your project. For the last few years, this has been difficult to do, since Cisco has done a relatively good job of innovating in the data center space. My recent Net Tech Field Day experience has shown me that there are other options that meet or exceed Cisco’s performance specs, so maybe it is time to search for competitive bids.

The Arista Networks presentation also demonstrates that other vendors are not standing still. Arista appears to be in a unique position to be able to price their devices at a premium. As a market strategy, Cisco and the other switch vendors need to begin innovating to put themselves in a similar position. (Hint… FCOE is not the answer). For Cisco, UCS is a potential driver of network equipment sales, but what about the other vendors?

(Disclaimer – Arista Networks, Force10, Hewlett Packard, and Juniper were sponsoring organizations of this event. There is no obligation for me to write anything about these companies or the other participants in this event. So while these musings came out of a sponsored trip, they are assuredly my own thoughts.)

Friday, September 17, 2010

The Ascendency of Hewlett Packard?

On Thursday I sat through several hours of Hewlett Packard presentations as part of the Net Tech Field Day program. Is there finally a viable competitor to Cisco across multiple product lines?

(Disclaimer – Hewlett Packard is one of the sponsoring organizations of this event. There is no obligation for me to write anything about HP, or the other participants in this event. So while these musings came out of a sponsored trip, they are assuredly my own thoughts.)

Through its recent acquisition of 3Com, Hewlett Packard has acquired some fairly impressive networking technology. The A-Series switch line stacks up well to Cisco’s Nexus product line. If you read the following marketing description, can you determine if this is a description of the Nexus 7K platform or of the HP A-12500? I don’t think I could:

Next-generation, large core/data center switching platforms with innovative Intelligent Resilient Framework (IRF) technology. 18- and 6-slot switches with up to 6.6 Tbps performance and up to 128 (1:1) or 512 (4:1) 10GbE ports, and 864 GbE ports. Non-blocking, zero service interruption design, and architecture support for 40 GbE and 100 GbE. Wire speed L2/IPv4/IPv6/MPLS.

Except for the mention of IRF and the slot count, it sounds like the Nexus 7K to me. (Reference http://h10144.www1.hp.com/products/switches/index.aspx#A12500).

Unfortunately HP didn’t provide any tangible information about how this switch looks and feels in a predominantly Cisco network. The Tech Field Day participants pleaded for some white papers and/or deployment guides, but none are available at this time. In a somewhat ominous sign, the A-Series presenter mentioned that a design guide had been written, but HP has not yet determined if it would be treated as proprietary information for HP’s services organization.

Takeaways

I have two main takeaways from the HP portion of Tech Field Day:

1) Cisco may have a viable competitor across their networking product line.

If the industry rejects new Cisco-proprietary protocols in the Data Center (L2MP, VNTag), HP and other DC switching vendors (Arista, Force10, Juniper) will be able to compete directly on “speeds and feeds.” This would not be good for Cisco. Once a company goes multi-vendor, Cisco will lose its pricing power across the majority of its product lines. This is a good thing for the industry. Buying 3Com makes that equipment line a viable solution for businesses. I did not have enough faith in 3Com as a standalone company to recommend their gear for my network. (As an aside, I do not have faith in Foundry, even after being acquired by Brocade)

What also must be said is that HP is taking a huge risk here. My organization is building a greenfield data center soon. Historically we’ve gone with HP for the majority of our compute power (C-class blade chassis) and Cisco for most of the network components. If HP wants an opportunity to bid for the network side of this project, we’ll also need to open the compute side up to Cisco. All’s fair, right? The issue with this is that the network opportunity is approximately one-third the size of the compute opportunity. HP is effectively risking X for a shot at X * 1.33, while Cisco is risking Y for a shot at 4 * Y.
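The arithmetic behind those multipliers is easy to check. The sketch below uses a made-up compute budget of 3.0 (in arbitrary units) and assumes, as stated above, that the network opportunity is one-third of it. The variable names are purely illustrative.

```python
compute = 3.0              # hypothetical size of the compute opportunity
network = compute / 3      # network is roughly one-third of compute
total = compute + network  # what a single winning vendor would take

# HP holds the compute business today; Cisco holds the network business.
hp_upside = total / compute     # HP risks X for a shot at this multiple of X
cisco_upside = total / network  # Cisco risks Y for a shot at this multiple of Y

print(round(hp_upside, 2), round(cisco_upside, 2))
```

In other words, the incumbent with the larger share has far more to lose, relative to what it can gain, by opening up the bid.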

2) HP’s large services organization could significantly hinder their ability to gain traction in the marketplace.

There are basically two types of resellers in the network marketplace. There are those that compete on price and attempt to shave a few percentage points of margin off every sale, and there are those who make their margins based on consulting or management services that are sold along with networking equipment. My organization generally prefers the former, as we are comfortable with our ability to deploy complex networking solutions. Price is usually our deciding factor for equipment purchases, although we have made exceptions for new technology, such as WAN acceleration and Cisco’s ACNS video platform. The problem with utilizing the cheapest provider is that the purchaser and the networking equipment vendor have to take responsibility for the network architecture, product selection and installation.

For organizations that have less network know-how, purchasing from resellers that can deliver the installation and management services required is a more attractive option. They can provide network designs and product selection advice. If the reseller has agreements with multiple equipment vendors, they can help determine which vendors' equipment best fits the customer’s needs. As a network engineer, these are the resellers I want to see in the marketplace, because they offer interesting jobs.

So what does this have to do with HP? As mentioned above, HP isn’t sure whether their deployment guides will be released to the general public, or for that matter, even to their resellers. This is ridiculous. If HP wants the world to adopt their product line, they need to do everything possible to educate the engineers responsible for deployments. If HP’s plan is to have their own services organization take the bulk of HP network deployments, they’ll lose all the value-added resellers. Those are the resellers they need the most, since they are the ‘trusted advisors’ for businesses. Without their support, HP will be forced to identify and win every networking deal on their own.

Wednesday, August 25, 2010

HSRP, vPC and the vPC peer-gateway Command

My recent post concerning my Migration from Catalyst to Nexus received a number of interesting and helpful comments.  One comment from routerworld caused me to do a bit of research into the “vpc peer-gateway” command.  This blog post is a summation of that research.

How HSRP Works

Hot Standby Router Protocol (HSRP) is a well-known feature of Cisco IOS.  The goal of HSRP is to provide a resilient default-gateway to hosts on a LAN.  This is accomplished by configuring two or more routers to share the same IP address and MAC address.  Hosts on the LAN are configured with a single default-gateway (either statically or via DHCP).
Upon sending its first packet to another subnet, the host ARPs for the MAC address of the default gateway.  It receives an ARP reply with the virtual MAC of the HSRP group.  The IP packet is encapsulated in an Ethernet frame with a destination MAC address of the default gateway.  If the primary router fails, HSRP keepalives are lost, and the standby HSRP router takes over the virtual IP address and MAC address.  The host does not need to know that anything has changed.

In the diagram above, the user (10.1.1.100) is configured with a default-gateway of 10.1.1.1.  When the user sends its first packet to 10.5.5.5, it ARPs for 10.1.1.1.  In my example, Router A is the HSRP primary router, so it sends an ARP reply with the virtual MAC address of 0000.0c07.AC05.  The User PC then encapsulates the IP packet (destination IP=10.5.5.5) in an Ethernet frame with a destination MAC address of 0000.0c07.AC05.  Router A accepts the frame and routes the packet.
The above paragraphs tell the story of packets coming from the HSRP-enabled LAN.  But what happens to reply packets coming from 10.5.5.5 to 10.1.1.100?  The answer is simple, and intuitive if you follow step-by-step.  First, the Server creates an IP packet with a destination of 10.1.1.100.  It encapsulates it in an Ethernet frame and forwards it to its default gateway (for this example, let’s say it is Router A).  Router A strips the Ethernet framing and determines the next hop is on the local subnet 10.1.1.0/24.  It encapsulates the packet in an Ethernet frame with a destination MAC address of 0021.6a98.1952 (the User PC).  The source MAC address is the physical MAC address of Router A (0024.F71E.3343).  Router A does not use the virtual MAC address as the source of packets it routes onto the local subnet.
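
The addressing in both directions can be written out as a small sketch. The 0000.0c07.acXX format is the HSRP version 1 virtual MAC, where XX is the group number in hex; group 5 is inferred from the address in the example rather than stated anywhere. The Frame class and the function name are illustrative, not part of any Cisco tooling.

```python
from dataclasses import dataclass


def hsrp_v1_virtual_mac(group: int) -> str:
    """Build the HSRP version 1 virtual MAC: 0000.0c07.acXX, XX = group in hex."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 group numbers range from 0 to 255")
    return f"0000.0c07.ac{group:02x}"


@dataclass(frozen=True)
class Frame:
    """Only the header fields that matter for this walk-through."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


USER_MAC = "0021.6a98.1952"        # User PC, from the example above
ROUTER_A_MAC = "0024.f71e.3343"    # Router A's physical (burned-in) address
VIRTUAL_MAC = hsrp_v1_virtual_mac(5)  # group 5 is an inference from ...AC05

# Host to server: addressed to the virtual MAC learned via ARP.
outbound = Frame(USER_MAC, VIRTUAL_MAC, "10.1.1.100", "10.5.5.5")

# Server to host: Router A sources the frame from its physical MAC.
inbound = Frame(ROUTER_A_MAC, USER_MAC, "10.5.5.5", "10.1.1.100")

print(outbound)
print(inbound)
```

Note that the host never sees the virtual MAC as a source address in routed traffic. That detail matters later, when a device decides to learn its next hop from received frames instead of from ARP.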

So, what is vPC?

Now that we’ve covered HSRP, let’s talk about virtual Port Channel (vPC).  vPC allows two NX-OS devices to share a port-channel.  Attached devices believe that they are connected to a single device via an etherchannel bundle.  This is great because it eliminates spanning-tree blocking along parallel paths.
To allow this to work, the paired NX-OS devices use two vpc-specific communication channels.  The first is a vpc peer-keepalive message.  This heartbeat lets one switch detect when the other has gone off-line, to prevent traffic from being dropped during a failure.  These are lightweight hello packets.
The second communication channel is the vpc peer-link.  This is a high-speed connection between the two NX-OS switches that is used to stitch together the two sides of the port-channel.  If a frame arrives on switch A, but is destined for a host on switch B, it is forwarded across the peer-link for delivery.  All things being equal, it is undesirable to forward frames across a vpc peer-link.  It is much better for the frame to be sent to the correct switch in the first place.  Of course, there’s no way for the attached device to know which path is more appropriate.

In the above example, the User PC is sending an Ethernet frame to the Server.  It creates a frame with a destination MAC address of 0033.9328.12A1 and sends it to the L2 Switch.  The L2 Switch has an entry in its forwarding table indicating that the destination MAC is accessible via the Port-Channel 100 interface.  It uses its etherchannel load balancing hash algorithm to determine which physical interface to forward the frame onto.  It is just as likely to choose the link to Nexus B as the link to Nexus A, even though the more efficient path is to Nexus A (someday TRILL will help us, but for now there is no solution).  If the frame is sent to Nexus B, it will forward the frame over the vPC peer-link to Nexus A.
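
To make the load-balancing point concrete, here is a toy model of the link selection. The hash below (XOR of the source and destination MACs, modulo the number of links) is an assumption for illustration only; real switches use platform-specific and configurable hash inputs. The client MAC addresses are made up.

```python
def mac_to_int(mac: str) -> int:
    """Convert a dotted Cisco-style MAC (e.g. 0033.9328.12a1) to an integer."""
    return int(mac.replace(".", ""), 16)


def pick_member_link(src_mac: str, dst_mac: str, links: list) -> str:
    """Toy etherchannel hash: XOR the two addresses, modulo the link count."""
    return links[(mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % len(links)]


LINKS = ["to-Nexus-A", "to-Nexus-B"]   # members of Port-Channel 100
SERVER_MAC = "0033.9328.12a1"          # from the example above
SERVER_ATTACHED_TO = "to-Nexus-A"      # the Server hangs off Nexus A only


def crosses_peer_link(src_mac: str) -> bool:
    """True when the hash picks the switch that the Server is NOT attached to."""
    return pick_member_link(src_mac, SERVER_MAC, LINKS) != SERVER_ATTACHED_TO


# Hypothetical client MACs, invented for this sketch.
clients = [f"0021.6a98.{i:04x}" for i in range(8)]
crossing = [mac for mac in clients if crosses_peer_link(mac)]

print(f"{len(crossing)} of {len(clients)} clients are hashed onto the peer-link path")
```

The L2 Switch has no knowledge of where the Server is attached, so with any reasonably even hash roughly half of the flows toward an orphan port end up crossing the peer-link.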

Cisco’s current recommendation is to build the vPC peer-link with multiple dedicated 10GE connections for performance reasons.  Cisco also recommends that all devices in a vPC-enabled VLAN be connected to both Nexus switches.  In the diagram above, the Server is considered to be a vpc orphan port.  This is undesirable, since it requires usage of the vpc peer-link.  It also has implications with multicast traffic forwarding.

vPC and HSRP Together

Now we’ve arrived at the point where we can pull all this information together.  In the following diagram, the User PC has been moved to a new VLAN.  The user is again trying to communicate with the server.

The User PC ARPs for its default gateway.  Nexus A (the HSRP primary) replies with the virtual MAC address of 0000.0C07.AC05.  The user creates an Ethernet frame with a destination address of the virtual MAC.  It then forwards the frame to the L2 Switch.  The L2 Switch uses its etherchannel load balancing algorithm to determine the physical link to use.  The difference now is that it doesn’t matter which link it uses.  The NX-OS switch on the other end will accept and route the packet.  In effect, both Nexus switches are HSRP active at the same time.  This eliminates the need to forward Ethernet frames across the vPC peer-link for packets that are destined for other subnets.

What Does “vpc peer-gateway” Do?

If we left everything alone, the story would be complete.  Unfortunately, storage vendors thought it would be a good idea to optimize their handling of Ethernet frames.  Some NetApp and EMC equipment ignores the ARP reply given by the HSRP primary and instead forwards Ethernet frames to whichever MAC address it receives frames from.  This is nonstandard behavior.
Using the diagram above, let’s assume that the User PC is now an EMC Celerra storage device.  The Server sends its packets (IP destination 10.1.1.100) to Nexus B, which routes them to the Ethernet LAN.  All IP packets with source IP 10.5.5.5 will be encapsulated in Ethernet frames with a source MAC address of 0022.5579.F643.  The EMC Celerra will cache the source MAC address of these frames, and when it has IP packets to send to 10.5.5.5, it will encapsulate them in Ethernet frames with a destination MAC of 0022.5579.F643.  It is choosing to ignore its default gateway for these outbound packets.
I suppose the theory behind this feature was to eliminate the extra hop within the LAN.  When HSRP is enabled, it is necessary to disable ICMP redirects.  This means that the routers will not inform hosts on the LAN that a better default-gateway is available for a particular destination IP address.  This storage feature saves a LAN hop.
Unfortunately, this optimization does not work well with vPC.  vPC relies on virtual MAC address sharing to reduce utilization across the vPC peer-link.  If hosts insist on addressing their frames to a specific router, suboptimal packet forwarding can occur.  According to Cisco, “Packets reaching a vPC device for the non-local router MAC address are sent across the peer-link and could be dropped by the built in vPC loop avoidance mechanism if the final destination is behind another vPC.”  At the application level we saw very poor performance due to these dropped packets.  Enough of the packets got through to allow access to the storage device, but file load times were measured in the tens of seconds, rather than milliseconds.
The “vpc peer-gateway” command allows HSRP routers to accept frames destined for their vPC peers.  This feature extends the virtual MAC address functionality to the paired router’s MAC address.  By enabling this feature, NX-OS effectively disables the storage vendors’ optimization.
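
The decision each vPC peer makes can be sketched as a simplified model. This is not how NX-OS is implemented; the class, method and return strings are invented for illustration. Nexus B's MAC comes from the example above, while Nexus A's MAC is an assumption (it reuses Router A's address from the HSRP example).

```python
from dataclasses import dataclass

HSRP_VMAC = "0000.0c07.ac05"     # shared virtual MAC
NEXUS_A_MAC = "0024.f71e.3343"   # assumed for this sketch
NEXUS_B_MAC = "0022.5579.f643"   # from the storage example above


@dataclass
class VpcPeer:
    """Simplified model of how one vPC peer treats an incoming frame."""
    name: str
    own_mac: str
    peer_mac: str
    peer_gateway: bool = False

    def handle(self, dst_mac: str) -> str:
        dst = dst_mac.lower()
        # Frames to our own MAC or the HSRP virtual MAC are always routed here.
        if dst in (self.own_mac, HSRP_VMAC):
            return "route locally"
        if dst == self.peer_mac:
            # With peer-gateway, the peer's router MAC is treated like our own.
            if self.peer_gateway:
                return "route locally"
            return "bridge across peer-link"
        return "switch at layer 2"


nexus_a = VpcPeer("Nexus A", own_mac=NEXUS_A_MAC, peer_mac=NEXUS_B_MAC)

# The storage device addressed its frame to Nexus B, but the hash chose Nexus A.
print("without peer-gateway:", nexus_a.handle(NEXUS_B_MAC))

nexus_a.peer_gateway = True
print("with peer-gateway:   ", nexus_a.handle(NEXUS_B_MAC))
```

Without the command, the frame takes the peer-link detour and is exposed to the loop avoidance drop that Cisco describes. With it, the frame is routed by whichever switch receives it, exactly as if it had been addressed to the virtual MAC.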

Conclusion

If you are running vPC and HSRP, and you have EMC or NetApp storage equipment, you probably need to add the “peer-gateway” command under your vpc configuration.  The only caveat to peer-gateway is the following (from NX-OS 5.0 - Configuring vPC):
“Packets arriving at the peer-gateway vPC device will have their TTL decremented, so packets carrying TTL = 1 may be dropped in transit due to TTL expire. This needs to be taken into account when the peer-gateway feature is enabled and particular network protocols sourcing packets with TTL = 1 operate on a vPC VLAN.”
I have yet to face this issue, so my recommendation is to add this to your vpc configuration as a default.
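
For anyone wondering what the TTL caveat looks like in practice, here is a minimal sketch of the rule as quoted. It models only the decrement-and-expire step; the function name is made up, and the exact NX-OS behavior should be confirmed against Cisco's documentation.

```python
from typing import Optional


def ttl_after_peer_gateway(ttl: int) -> Optional[int]:
    """Return the TTL after the peer-gateway device routes the packet.

    Returns None when the decrement leaves nothing to forward, which is
    the "TTL expire" drop described in the caveat.
    """
    remaining = ttl - 1
    return remaining if remaining > 0 else None


for ttl in (64, 2, 1):
    result = ttl_after_peer_gateway(ttl)
    status = "dropped (TTL expired)" if result is None else f"forwarded with TTL {result}"
    print(f"TTL {ttl}: {status}")
```

Ordinary host traffic starts with a TTL far above 1, so it is unaffected. The packets at risk are those from protocols that deliberately send with TTL = 1 on a vPC VLAN.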