Defaultroute - Technical Guides: May 2011

Monday, 23 May 2011

Securing access to JunOS - Part 1

So we've got to secure our Juniper router from un-authorised access attempts. There are a couple of ways we can access the device for configuration; console, telnet, SSH, HTTP(s) and NMS to name the main ones. Each of these however *needs* a username and password to permit access before we get in, we need it for authentication.

So lets go ahead and configure a username and password for our router. We've decided to be totally cliche and use the name 'admin' for our administrator account. First we need to get into configuration edit mode by entering the configure keyword from 'exec' mode.

root@R1> configure

Now lets enter the 'system' command tree (which contains the login 'features') by entering:

root@R1# edit system

Now we can add the user:

Whoa! What went wrong here? Well it's simple really, JunOS saved my arse. The password I chose to input here was 'junos'. Sure it wasn't a very secure password but I figured it was for a demo so who cares? Well JunOS cares. It is very kindly reminding you that the password you put in was poor and didn't pass the restriction guidelines of minimum password length of 6 characters (junos is 5 of course) AND that it didn't contain a mixture of either case, numbers or punctuation. So lets do that again and choose the password JunOS1!

All good. Now you know what - our company guidelines are more strict than that. We need to have a minimum of 8 characters and we need to also make sure that the new passwords being set have at least 3 characters of difference from the last. That is to say if my current password if PassW0rd! then my new password cannot be PaSSW0rd! because I only changed two of the letters to uppercase.

So lets put in the configuration to get the company policy met:

OK, so user set, password right, now lets put our user into the right security group. Admin is clearly an administrator and so we'll put that user into the highest security group with the most rights. Note that this command has been entered from the 'top' of the configuration tree rather than from inside the 'system' branch:

Right now we need to enable the services themselves. Out of the box a fresh JunOS configuration has no services enabled. Lets turn on Telnet and SSH.

Now we look back at our company security policy and notice that we are told to exclude 'root' user access from SSH enabled devices. We can do this in JunOS...well done eh...

So now the only user configured outside of 'root' is 'admin' so we'll be using admin to access the device. All good so far. But hey, we've got a router here on the network...anyone can access it. What we really need to do now is restrict access to it. Now on a Cisco IOS device we'd be talking about an access-list. We'd write the list and then apply it to an interface (vty for telnet/ssh). Well JunOS is just the same except, like always, JunOS does things a little differently.

Lets start be creating a firewall filter (comparable to Cisco access-list). First enter the 'firewall' edit mode, notice we first skip back to the top of the configuration tree:

Lets setup our filter to allow access from hosts in the 192.168.1.0/24 network. Forgive the screenshot the syntax goes thus (remember we are already at [edit firewall] level:

root@R1# set filter ssh-and-telnet-only term source-hosts from source-address 192.168.1.0/24

So thats the network access limited, and now we'll make sure this filter is matching traffic destined for telnet and ssh ports only. Remember SSH runs on port TCP/22 and Telnet on TCP/23 but we can match those using the application name. Again, the screen host has truncated and the beginning is 'set filter ssh-and-telnet-only term sour.....'

Finally we need to 'accept' this traffic. This is like the IOS 'permit' keyword.

So what do we do with the access which is denied? Well those sessions are dropped but wouldn't it be nice to see how many we're dropping? Lets add to our filter now with the 'count' keyword followed by a 'reject'. Both of these statements are called last in the filter. In IOS the reject would be implicit as a 'deny' at the end but like IOS if we want to log access attempts we need to add the 'deny log' statement at the end too...so not too dissimilar eh?

Right, access-list done now but we need to bind it to an interface. Just like in IOS where you would use an access-group or access-class for VTY ports in JunOS we simply add the 'filter' to the interface. Again, the screenshot it truncated:

root@R1# set interfaces em0.0 family inet filter...

OK, lets try a telnet now.

Now lets try an SSH. First attempt causes my unix based host to prompt to add the SSH key signature to my known_hosts file...the second attempt has no such issue...we login...done.

Thanks for reading

Friday, 20 May 2011

Juniper EX Switch firmware upgrade process

Juniper EX switches come with two separate flash partitions a root (default boot) and a copy of that root in another piece of flash memory. Each partition contains a copy of the boot software and the configuration. Now we've deployed a number of EX clusters and from time to time we notice that sometimes the secondary partitions (the non-active one) doesn't get upgraded with the active one. This obviously causes problems if the active partition is corrupted and wont boot. Manually booting the secondary to get the switch up doesn't help if it's in a cluster because that node is then marked at NotPrsnt (Not Present) due to the fact it won't be running the same software as the other nodes.

So, luckily Juniper have come to the rescue and brought out the latest 10.4 firmware to do the following (source: www.juniper.net)

Resilient dual-root partitioning, introduced on Juniper Networks EX Series Ethernet Switches in Junos operating system (Junos OS) Release 10.4R3, provides additional resiliency to switches in the following ways:

Allows the switch to boot transparently from the second root partition if the system fails to boot from the primary root partition.
Provides separation of the root Junos OS file system from the /var file system. If corruption occurs in the /var file system (a higher probability than in the root file system due to the greater frequency in /var of reads and writes), the root file system is insulated from the corruption.

Great news this. So lets upgrade our firmware and get this great feature. Here is a copy of todays firmware version:

We need to get hold of the latest firmware from Juniper's website so we download that...but there was also an issue with our Jloader...it was too old. We will need to delete the old loader and upgrade that as well as the firmware (Jloader Upgrade Link). Good news is we can do both then reboot to cut down on reboot time. We've downloaded the jloader and junos image (10.4R3 is recommended at the time we wrote this).

First lets just check we can ping the FTP server holding the firmware images. We're using FileZilla server for this and we've created a new user call 'junos'. You can get hold of FileZilla server here

Looks good, right lets go with the upgrade.

Jloader first. The ftp load command uses the syntax 'request system software add ftp://10.10.15.23/jloader-ex-3242-11.3I20110326_0802_hmerge-signed.tgz'. This is basically saying we're using FTP to get the image and that the image is located on server with IP address 10.10.15.23. Without stating a username and password int he form ftp://username:password@ the JunOS parser will use the default username of 'anonymous' with no password. Some FTP servers come with built in anonymous support...FileZilla needs you to create a user called 'anonymous' with the password checkbox unchecked. HEre is the process:

Each node in the cluster will be upgraded in turn until it finishes and returns you back tot he prompt:

The FileZilla management console shows you the whole process as the file is pulled back to the cluster master node. Here is a brief screen shot of the download:

Right, thats the jloader upgrade bit applied (but not yet active until we reboot. To save time we're now going to upgrade the firmware so that we only do one reboot. Here is the process and remember, this is a 4 node cluster as shown by the fpc0,1,2,3. If you have a larger node number then your output will be different.

So, just like the man said 'A reboot is required to install the software'. Let us oblige...

It took about 5 minutes to come around again, I logged in and checked the firmware versions and loader.

Thats OK now I checked the state of the partitions

Looks like I have two partitions there active/backup....all looking pretty sweet. Lets get some more information on the state of those partitions...detail?...nah thats what they would expect you to do lets look at the snapshot...

We're upgraded, we've got two healthy partitions...just need to wait for a failure now to see it automatically fix itself...but I won't wish for that. I think one question we're all asking is how do I find out if there has been a partition failure if it fixes itself?

Well you've got console logs, syslog and SNMP...take your pick. From the management port you will see

WARNING: THIS DEVICE HAS BOOTED FROM THE BACKUP JUNOS IMAGE

You can of course always look at the chassis alarms.

user@switch> show chassis alarms
1 alarms currently active
Alarm time Class Description
2011-02-17 05:48:49 PST Minor Host 0 Boot from backup root

Thank you for reading and may all of your upgrades be as sweet.

Thursday, 19 May 2011

BGP Origin, what it means to be i,e,?

BGP has a number of metric affecting values, one of the most often ignored is origin although it can have a serious effect on your routing table. Imagine getting two routes from two eBGP neighbors. ISPA is connected to CUSTOMER_Z using OSPF and redistributes those learned networks into BGP then advertises those to it's other upstream peers. ISP_B on the other hand is also connected to the same CUSTOMER_Z however ISPB decides to bring those OSPF networks into the BGP RIB using the 'network' statement. The upstream BGP peers will receive two routes into their RIB one with an origin of i (the one from the ISP_B) and one with origin ? (the one from ISP_A).

BGP uses a number of 'tiebreakers' to decision the best route to a destination and here are the first 5; Weight (Cisco proprietary), Local Preference, AS-PATH length, Origin, MED. So origin comes right after AS-PATH length and actually is the second highest decision maker for your outbound traffic...it really matters.

So in this tech guide we're going to build a small 3 mode network. The diagram below shows two BGP autonomous systems. 65001 contains R1 and R2 and they are connected with the network 12.1.1.0/24. R3 is in another AS with number 65002. The network between R2 and R3 is 13.1.1.0/24

Here is the output from R1 showing first the ip routing table and then a 'show run' for the interface connected to R2

Here is the same output for R2. Note that R2 is connected to R1 via Fa0/0 and R3 via Fa0/1

Finally the same output for R3

OK so we have IP comms between R1, R2 and R3. Now we're going to configure the BGP relationships.

First we'll add the neighbor statement on R1 for R2 (remember we already enabled the BGP process by issuing the 'router bgp 65001' global configuration command.

Now we do the same for the other side of this neighborship on R2 pointing to R1

Note that both have the same AS# because they are iBGP neighbors. We see now that the neighbor relationship is 'Established'

OK so now lets setup the eBGP relationship between R2 and R3. Firstly on R2

Now on R3

Note the difference in AS# between that configured in the 'router bgp' and the 'remote-as' this is an eBGP relationship. We get this console message on R2 - the neighborship is Established.

So lets create our first 'network' to advertise via BGP from R3. We'll just use a loopback interface, here it is loopback0.

We bring this into BGP using the 'network' statement.

Now on R2 we can see that R3 has indeed sent us a route for 150.1.1.0/24. Notice the origin for the route is 'i' or internal. Thats because we brought the route into the BGP RIB (Routing Information Base) using the 'network' keyword.

So what happens if we bring the network in via a redistribution? We'll create another loopback interface on R3 called loopback1 and redistribute that connected interface into the BGP RIB.

Firstly the interface configuration

Now we add the necessary configuration to the bgp routing process. Firstly we need to be careful here. simply redistributing all connected interfaces will also bring our loopback 0 interface in...so lets use a route-map. We'll match the network for loopback 1 using an access-list

Here is the access list (we could use the 'host' keyword instead of 0.0.0.0)

Now the route-map. We're matching access-list 1 which is 200.1.1.0/24 which is the network we configured for interface loopback 1. Any other match is denied and dropped i.e. it won't be redistributed.

So lets have a look at the routing table on R2 to see if we now how two routes, the first from 150.1.1.0 and now a second for 200.1.1.0.

We do! Thats great. Now look at the origin. The first was an 'i' for internal which you remember was brought in using the 'network' keyword. Now the new network 200.1.1.0 has an origin of '?' which means incomplete. The reason for this is we don't know the source of the route so our knowledge of it's origin is...well incomplete. All we know is that someone brought it in from somewhere.

So there is one more origin type and that is 'e' or EGP. Now EGP is a legacy protocol and I've never come across it. To create the 'e' origin type therefore we'll have to 'fudge' it. We're going to use a route-map again to set this.

First lets create another loopback interface on R3 for this EGP candidate route.

OK, so now lets edit the existing route-map to add in our EGP configuration. What we'll do is again using an access-list match the interface lo2. If it matches then change the origin to e and set the source AS# for the EGP to 65003.

OK, now thats done we'll wait for a while till the BGP routes settle (or we can force it with a 'clear ip bgp *' at either side of the peering).

There we go...all three origin types. If these were duplicated routes learned from different sources with the same AS_PATH length we'd choose i first, then e then ?.

So what about R1? You're right we didn't even use this yet...lets take a look at it's routing table...

Nothing? Of course - is the next hop in my routing table? No of course not so the route goes into the BGP RIB but is inaccessible. So we need to set the 'next-hop-self' on R2 to change the next hop to R2's fa0/0 interface.

OK lets see the routing table now.

Looks good - of course R3 has no way of knowing how to get to 12.1.1.0/24 yet but you know how to do that now right? Of course you do...

Thanks for reading.

Sunday, 15 May 2011

CCIE lab shortcuts - tclsh

Well, being as we all know lab time is precious we're all looking for CCIE shortcuts. Certainly in my labs (passed third time) I found the last hour was usually spent pinging everything to make sure I had that special full routing table. So, you've seen those practice labs right? You've probably gotten hold of a training package from one of the big Cisco training vendors like InternetworkExpert or IPExpert right? Well those full labs they throw at you are pretty extensive and contain a lot of IP addresses. If we were to assume that the real lab has just as many IP addresses then how could you possibly remember all of those. Not only that but could you really be up to the task of pinging each and everyone from each and every layer 3 device on the network to test for full reachability?

Well of course not, that would be nuts. So here is what I did for my lab (frankly I only had enough time to exploit this in my third attempt) and I hope it can help you out. Of course you could always use this on your production network as part of some alarm script or reachability script but I digress.

So, tclsh, what is it and how do we use it? Well it's a very powerful application in it's own right, read about it here. On IOS it runs under it's own binary and is called from privileged EXEC mode. The power of tclsh is not in the scope of this document and honestly I've not even scratched the surface myself. We're going to concentrate on running the application, writing a basic script, executing the script and using it to debug issues in your lab before the time runs out.

So first off how do we call tclsh? From privileged EXEC mode type 'tclsh' then press return

You see you get dropped into a sub-prompt or actually the TCLSH command shell. It is inside here that you can create your script. For our example here I've built up a small three node network. R1, R2 and R3 are connected together using subnets 10.1.12.0/24 (R1<->R2) and 10.1.23.0/24 (R2<->R3) and we are running EIGRP across all three. Each has a local loopback0 interface. Lo0 on R1 is 1.1.1.1, Lo0 on R2 is 2.2.2.2 and Lo0 on R3 is 3.3.3.3.

Here is a diagram so we can see the topology:

Screen shot 2011-05-15 at 18.56.02

eigrp process ID 100 is running on each router. Here is the routing table on R1:

Screen shot 2011-05-15 at 18.58.02

Ok, so in the lab you're going to be busy, too busy to remember IP addresses and *way* to busy to type these all in. Here is a tip. As you go through the lab have notepad open and type in the IP addresses for all of your nodes. I'd use notepad a lot if I were you. There was a study done (can't recall the publisher sorry) which stated that more than 60% of exam (not just CCIE) error was down to mistyping or human errors (the technical term for this of course is "fat fingers"). So why would you set yourself up for this sort of thing - put it in notepad and minimise your risk.

Here is Fred who could press QWASEDZXC with one finger easily.

fredfatfinger

So anyway you've got a list of IP addresses - this is perfect. At the end of your lab you're going to be ahead of the game when it comes to troubleshooting with tclsh!

The basic setup here for our tclsh script is we are going to run an iterated 'for' loop. For any programmers out there you'll be aware of 'for' loops. The idea behind a loop is to reduce the number of lines of code for repetitious process. e.g. "check A looks like B" or "doesn't look like B", or maybe "is smaller than B", or "bigger than B" etc. the point is that whatever the differentiator is between A and B do it, and do it again until the end...whatever the end is. In our ping script we're not doing any comparison we're just being iterative through a list of address. Hmm this is getting hard to put down in words I guess so here is our example:

We've got a number of IP addresses in our topology and these are in our notepad list:

1.1.1.1
2.2.2.2
3.3.3.3
10.1.12.1
10.1.12.2
10.1.23.2
10.1.23.3

Cool. Right now the syntax for our for loop starts with a 'foreach VARIABLE_NAME'. Now the 'VARIABLE_NAME' can be anything but it's usual to pick something small like a letter e.g. a or b, most guides I've seen us 'i', but it could easily be 'ipaddress' or even 'ccie'...you choose. The VARIABLE_NAME is just something you are associating with your IP addresses i.e. the first IP address in your list, for the first run through the loop will associate with the letter 'a' or in other words a=1.1.1.1.

The tclsh script ends by telling IOS what command it is supposed to run against the variable in the script. Our script is written to 'ping' so clearly we'll need to have ping in there somewhere ;-)

Here is the command based on us choosing a VARIABLE_NAME of 'a'. If you chose something else here then you should substitute it.

"ping $a repeat 1"

Lets walk through this. The tclsh parser will send the string "ping $a repeat 1" into the privileged EXEC mode just as if you were typing it. Now remember $a takes on the IP address for each iteration through the loop so first round $a=1.1.1.1 so now the line makes more sense "ping 1.1.1.1 repeat 1". We are saying repeat 1 because we don't need IOS to do the usual 5 pings...we're running low on time right?! So we rung one ping and we look for a "!" which means all good. Now some people tell me "Hey, sometimes the first ping is lost because the router needs to lookup the MAC address and it times out". Well I say use whatever works best for you. In the lab it's likely your routers will already have the MAC addresses they needs due to routing protocols and other troubleshooting you already did so one ping should be enough.

Right, we've been through the logic behind the script. Here is is being used.

Screen shot 2011-05-15 at 19.20.45

Thats it - all pinged - I guess I passed ;-)

The tclsh application is way more powerful than this but for an exam tool to save you some time this is probably as hard as you need to go into it. For a more in depth study of tclsh in IOS and how powerful it can be for operation use then I can recommend this Cisco Press book.