Wednesday, 11 June 2014

pfSense: Auto reboot if Internet connection lost

I use pfSense as both my router and my firewall. If the ADSL line drops, it normally comes back up a minute later with no problem. But a few times per year the line comes back up in a funny state, and I end up having to reboot the pfSense box manually to recover. So we can run a script from cron to fix that...
Based on a great post here on the pfSense forums...

1. Log in to pfSense via the serial console or SSH (you may need to enable SSH in the web GUI first). Then select "8" for the shell.

2. To remount file systems as read-write, run: /etc/rc.conf_mount_rw

3. Use vi editor to create /usr/local/bin/ping_check.sh  as follows...
#!/bin/sh
# First sleep for 6 mins so that we don't run this code if box has only just booted and PPPoE not up yet.
/bin/sleep 360

# Try 12 mins worth of very short pings to Google's DNS servers...
# Quit immediately if we get a single frame back.
# If neither server responds at all, flip the WAN interface down and up and try again.
# If neither server responds the second time either, reboot the firewall.
# Use conservative timers so that we don't reboot unless outage lasts more than 20 minutes.
# Also we don't reboot more than a couple of times per hour.
# This is to counter risk of box getting in funny state during reboot cycle and needing power cycling...

counting=$(/sbin/ping -o -s 0 -c 360 8.8.8.8 | /usr/bin/grep 'received' | /usr/bin/awk -F',' '{ print $2 }' | /usr/bin/awk '{ print $1 }' )
if [ $counting -eq 0 ]; then

     counting=$(/sbin/ping -o -s 0 -c 360 8.8.4.4 | /usr/bin/grep 'received' | /usr/bin/awk -F',' '{ print $2 }' | /usr/bin/awk '{ print $1 }' )
     if [ $counting -eq 0 ]; then

        # network down
        # Try flipping WAN NIC in the hope that will trigger BT modem to reconnect properly ...
        /sbin/ifconfig vr3 down
        /bin/sleep 10
        /sbin/ifconfig vr3 up
        /bin/sleep 60

        counting=$(/sbin/ping -o -s 0 -c 360 8.8.8.8 | /usr/bin/grep 'received' | /usr/bin/awk -F',' '{ print $2 }' | /usr/bin/awk '{ print $1 }' )
        if [ $counting -eq 0 ]; then

            counting=$(/sbin/ping -o -s 0 -c 360 8.8.4.4 | /usr/bin/grep 'received' | /usr/bin/awk -F',' '{ print $2 }' | /usr/bin/awk '{ print $1 }' )
            if [ $counting -eq 0 ]; then

               # network STILL down
               /etc/rc.stop_packages
               /etc/rc.reboot

            fi
        fi
     fi
fi

4. To make the script executable, run: chmod +x /usr/local/bin/ping_check.sh

5. To mount as read-only again, run: /etc/rc.conf_mount_ro
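
The script repeats the same ping-and-parse pipeline four times. As a rough sketch (not what I actually run), the parsing could be pulled out into a helper. The function names count_received and host_is_silent are made up for illustration, and the sample line assumes the usual FreeBSD ping summary format. The helper also prints 0 when no summary line is found, so the numeric test never sees an empty string.

```shell
#!/bin/sh
# Extract the "packets received" count from ping's summary line on stdin.
# Prints 0 if no summary line is found, so the caller's numeric test
# never has to deal with an empty value.
count_received() {
    n=$(grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }')
    echo "${n:-0}"
}

# Hypothetical wrapper: succeeds if the given host sent no reply at all.
# The ping flags are the same ones used in the script above.
host_is_silent() {
    replies=$(/sbin/ping -o -s 0 -c 360 "$1" 2>/dev/null | count_received)
    [ "$replies" -eq 0 ]
}

# Try the parser on a sample summary line (made-up numbers).
sample='360 packets transmitted, 0 packets received, 100.0% packet loss'
printf '%s\n' "$sample" | count_received
```

With helpers like these, each nested test in the script could read something like: if host_is_silent 8.8.8.8 && host_is_silent 8.8.4.4; then ...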

Now you need to add a cron job to automatically run this every 10 minutes.

6. Go into pfSense web interface - and select:
- Packages (under System)
- Cron (0.1.8 is what I found when writing this)
- Select "+" and install Cron.   

7. Then go into Cron (under Services)


8. Click "+" and add the following entry:

minute:  */10      (if you want to run every 10 minutes)
hours:    *
mday:    *
month:   *
wday:     *
(who):   root
command:   /usr/local/bin/ping_check.sh 

Click "Save"
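
For reference, the fields above map onto a single line in the system crontab format (minute, hour, mday, month, wday, who, command). The snippet below only builds and prints that line; the variable name cron_entry is just for illustration, and it deliberately does not write to any file.

```shell
#!/bin/sh
# System crontab field order: minute hour mday month wday who command
cron_entry='*/10 * * * * root /usr/local/bin/ping_check.sh'

# Print the entry so it can be checked before being used anywhere.
printf '%s\n' "$cron_entry"
```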

That's it!


Test by unplugging your ADSL line. The box should reboot after about 30 minutes. You could adjust the timings, but I didn't want my box rebooting more than a few times per hour at most. I also wanted the auto reboot to happen slowly enough that I have time to get in and administer the firewall if the ADSL line is dead (for example, to set up a 3G cellular backup connection).
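
The "about 30 minutes" figure can be checked by adding up the timers in the script. This is only a back-of-envelope sketch: it assumes ping sends one packet per second (the default interval), so each -c 360 run takes roughly 360 seconds when nothing replies, and it counts from the moment one run of the script starts. The variable names are made up for the calculation.

```shell
#!/bin/sh
boot_wait=360            # initial /bin/sleep 360
ping_run=360             # one "ping -c 360" run with no replies, at about 1 s per packet
nic_pause=$((10 + 60))   # the sleeps around the ifconfig down/up

# Two servers are tried before the NIC flip and two again after it.
total=$((boot_wait + 2 * ping_run + nic_pause + 2 * ping_run))

echo "$total seconds, about $((total / 60)) minutes"
```

To shorten or lengthen the delay before a reboot, the sleep values and the -c counts are the numbers to change.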


9 comments:

  1. Thanks for this very useful note. I set it up as you said and it works fine. We have trouble from time to time with our Comcast gateway and I can't always send someone in in the middle of the night. PROBLEM SOLVED!!! Thanks

  2. That helped me a lot... I just preferred to paste the whole script via the "Edit File" feature on pfSense :-)

    Replies
    1. Thanks for your note, Jan! That's reminded me that I need to put that script back in place, as I've recently rebuilt my firewall from scratch on new hardware.

  3. I have adjusted the script a bit to run it more frequently...


    # Testing uptime to run script only xx seconds after boot

    # Current time
    curtime=$(date +%s)

    # Boot time in seconds since the epoch
    boottime=$(sysctl kern.boottime | awk -F'sec = ' '{print $2}' | awk -F',' '{print $1}')

    # Uptime in seconds
    uptime=$(($curtime - $boottime))

    # If boot is longer than 120 seconds ago...
    if [ $uptime -gt 120 ]; then

    # A message to the console (I like feedback)
    echo "Testing Connection at" `date +%Y-%m-%d.%H:%M:%S` "uptime:" $uptime >> file.txt
    wall file.txt
    rm file.txt

    # Try 1 or 2 minutes worth of very short pings to Google's DNS servers.
    # Quit immediately if we get a single frame back.
    # If neither server responds at all then reboot the firewall.

    counting=$(ping -o -s 0 -c 10 8.8.8.8 | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }' )

    if [ $counting -eq 0 ]; then

    counting=$(ping -o -s 0 -c 10 8.8.4.4 | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }' )

    if [ $counting -eq 0 ]; then

    # network down
    # Save RRD data
    /etc/rc.backup_rrd.sh
    reboot

    fi
    fi
    fi

    Replies
    1. Interesting... The reference to file.txt failed for me (using the NanoBSD image for a headless firewall) but it might work as /tmp/file.txt.

  4. PS: I didn't bother installing the GUI Cron package this time (fresh install of pfSense 2.2.4) - I just added my cron job to the bottom of /etc/crontab. But I can believe that my next in-place version update of pfSense might blow away my /etc/crontab entry. I will have to check in due course.

  5. The tiniest of improvements.... makes my life much easier


    # Testing uptime to run script only xx seconds after boot

    # Current time
    curtime=$(date +%s)

    # Boot time in seconds since the epoch
    boottime=$(sysctl kern.boottime | awk -F'sec = ' '{print $2}' | awk -F',' '{print $1}')

    # Uptime in seconds
    uptime=$(($curtime - $boottime))

    # If boot is longer than 120 seconds ago...
    if [ $uptime -gt 120 ]; then

    # A message to the console (I like feedback)
    echo "Testing Connection at" `date +%Y-%m-%d.%H:%M:%S` "uptime:" $uptime "seconds" >> file.txt
    wall file.txt
    rm file.txt

    # Try 1 or 2 minutes worth of very short pings to Google's DNS servers.
    # Quit immediately if we get a single frame back.
    # If neither server responds at all then reboot the firewall.

    counting=$(ping -o -s 0 -c 10 8.8.8.8 | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }' )

    if [ $counting -eq 0 ]; then

    counting=$(ping -o -s 0 -c 10 8.8.4.4 | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }' )

    if [ $counting -eq 0 ]; then

    # trying to just restart NIC

    ifconfig hn0 down
    ifconfig hn0 up

    counting=$(ping -o -s 0 -c 10 8.8.8.8 | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }' )

    if [ $counting -eq 0 ]; then

    # network down
    # Save RRD data

    /etc/rc.backup_rrd.sh
    reboot

    fi
    fi
    fi
    fi

  6. Nice idea to try flipping the NIC down and up again... I might add that to my setup with a 1 second sleep in between, in the hope of resetting the fibre modem.

  7. Thanks for all the info/contributions here...

    Note the second "counting" line has an error in the post, as in it is missing a space within the "/usr/bin/awk -F',''{ print $2 }'" section...

    should be...

    counting=$(/sbin/ping -o -s 0 -c 360 8.8.4.4 | /usr/bin/grep 'received' | /usr/bin/awk -F',' '{ print $2 }' | /usr/bin/awk '{ print $1 }' )

