Josh Rendek

<3 Ruby

2 Patterns for Refactoring With Your Ruby Application

When working on a Rails application you will sometimes find duplicated or very similar code in two different controllers (for instance, a UI action and an API endpoint). Once you notice the duplication, there are several things you can do about it. I’m going to go over how to extract this code into a query object and then clean up its constructor using the builder pattern, adapted to Ruby.

I’m going to make a few assumptions here, but this should be applicable to any data access layer of your application. I’m also assuming you’re using something like Kaminari for pagination and have a model for People.

dummy_controller.rb
def index
  page = params[:page] || 1
  per_page = params[:per_page] || 50
  name = params[:name]
  sort = params[:sort_by] || 'last_name'
  direction = params[:sort_direction] || 'asc'

  query = People
  query = query.where(name: name) if name.present?
  # Kaminari's page-size method is `per`
  @results = query.order("#{sort} #{direction}").page(page).per(per_page)
end

So we see this duplicated elsewhere in the code base and we want to clean it up. Let’s start by extracting it into a new class called PeopleQuery.

I usually put these under app/queries in my Rails application.

people_query.rb
class PeopleQuery
  attr_accessor :page, :per_page, :name, :sort, :direction, :query
  def initialize(page, per_page, name, sort, direction)
    self.page = page || 1
    self.per_page = per_page || 50
    self.name = name
    self.sort = sort || 'last_name'
    self.direction = direction || 'asc'
    self.query = People
  end

  def build
    self.query = self.query.where(name: self.name) if self.name.present?
    # Kaminari's page-size method is `per`
    self.query.order("#{self.sort} #{self.direction}").page(self.page).per(self.per_page)
  end
end

Now our controller looks like this:

dummy_controller.rb
def index
  # Same request parameters as the original action (:sort_by and :sort_direction)
  query = PeopleQuery.new(params[:page], params[:per_page], params[:name], params[:sort_by], params[:sort_direction])
  @results = query.build
end

Much better! We’ve decoupled our controller from our data access object (People/ActiveRecord) and moved the query logic out of the controller into a class whose only job is building the query. But that constructor doesn’t look very nice. We can do better since we’re using Ruby.

Our new PeopleQuery class will look like this and will use a block to initialize itself instead of a long list of constructor arguments.

people_query.rb
class PeopleQuery
  attr_accessor :page, :per_page, :name, :sort, :direction, :query
  def initialize(&block)
    yield self
    self.page ||= 1
    self.per_page ||= 50
    self.sort ||= 'last_name'
    self.direction ||= 'asc'
    self.query = People
  end

  def build
    self.query = self.query.where(name: self.name) if self.name.present?
    # Kaminari's page-size method is `per`
    self.query.order("#{self.sort} #{self.direction}").page(self.page).per(self.per_page)
  end
end

We yield first to let the caller set the values, and after yielding we fill in defaults for anything that wasn’t set. There is another way of doing this with instance_eval, but inside an instance_eval block self becomes the query object, so the controller’s params method is no longer reachable. You would have to pass params around explicitly, which makes the constructor look worse, so we’re going to stick with yield.

dummy_controller.rb
def index
  query = PeopleQuery.new do |query|
    query.page = params[:page]
    query.per_page = params[:per_page]
    query.name = params[:name]
    # Same request parameters as the original action
    query.sort = params[:sort_by]
    query.direction = params[:sort_direction]
  end
  @results = query.build
end
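
To make the difference between yield and instance_eval concrete, here is a minimal sketch in plain Ruby with no Rails involved. FakeController, YieldQuery and EvalQuery are hypothetical names used only for this illustration; params is a method on the controller, just as it is in Rails.

```ruby
# Hypothetical stand-in for a Rails controller: `params` is a method here too
class FakeController
  def params
    { page: 3 }
  end

  # With yield the block keeps the controller as self, so `params` resolves
  def build_with_yield
    YieldQuery.new { |q| q.page = params[:page] }
  end

  # With instance_eval self becomes the query object, so `params` has to be
  # captured in a local variable before the block
  def build_with_instance_eval
    request_params = params
    EvalQuery.new { self.page = request_params[:page] }
  end
end

class YieldQuery
  attr_accessor :page, :per_page

  def initialize
    yield self if block_given?
    # Defaults are applied only for values the caller left unset
    self.page ||= 1
    self.per_page ||= 50
  end
end

class EvalQuery
  attr_accessor :page, :per_page

  def initialize(&block)
    instance_eval(&block) if block
    self.page ||= 1
    self.per_page ||= 50
  end
end

controller = FakeController.new
puts controller.build_with_yield.page
puts controller.build_with_instance_eval.page
```

Both versions end up with the same object; the yield version just lets the caller keep using its own methods inside the block.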

And that’s it! We’ve de-duplicated some code (remember, we assumed the dummy controller’s index method was duplicated elsewhere in an API call in a separate namespaced controller), extracted a common query object, decoupled our controller from ActiveRecord, and built a nice way to construct the query object using the builder pattern.

Parsing HTML in Scala

Is there ever a confusing amount of information out there on parsing HTML in Scala. Here is the list of possible ways I ran across:

  • Hope the document is valid XHTML and use scala.xml.XML to parse it
  • If the document isn’t valid XHTML use something like TagSoup and hope it parses again
  • Still think it’s valid XHTML? Try using scalaz’s XML parser

All of the answers I found on Google pointed to some type of XML parsing, which won’t always work. Coming from Ruby I know there are tools out there like Selenium that can simulate a web browser for you and give you a rich interface to interact with the returned HTML.

So I went on Maven and found the two Selenium web drivers I wanted for my project and added them to my libraryDependencies:

"org.seleniumhq.webdriver" % "webdriver-selenium" % "0.9.7376",
"org.seleniumhq.webdriver" % "webdriver-htmlunit" % "0.9.7376"

The project I’m working on parses Looking Glass websites for BGP information and AS peering, so I wanted to scrape the data. I also didn’t want to have to use a full-blown web browser (à la Selenium + Firefox, for instance), so I stuck with the HtmlUnit driver for the implementation.

Here is a quick code snippet that lets me grab AS #’s and Peer names from an AS:

AS.scala
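import org.openqa.selenium.htmlunit.HtmlUnitDriver
import scala.collection.JavaConverters._

val url = "http://example.com/AS" + as.toString

val driver = new HtmlUnitDriver
// Proxy for BetaMax when writing tests
if (_port != null) {
  driver.setProxy("localhost", _port)
}
driver.get(url)

val peers = driver.findElementsByXPath("//*[@id=\"table_peers4\"]/tbody/tr/td[position() = 1 or position() = 2]")

// findElementsByXPath returns a java.util.List, so convert it first and then
// group the cells in pairs so List(a,b,c,d) becomes List(List(a,b), List(c,d)).
// (peers zip peers.tail would give overlapping pairs: (a,b), (b,c), (c,d))
for (peer <- peers.asScala.grouped(2)) {
  println(peer.map(_.getText).mkString(" "))
}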

No XML to muck with, and I get some nice selectors to query the document with. Remember: if the source you want data from doesn’t have an API, HTML is an API! Just be respectful of how you query and interact with it (i.e. don’t do 100 requests/second, cache/record responses while writing tests, etc).

Getting Started With Scala

Recently I’ve been getting into more Java and (attempting) Scala development. I always got annoyed with the Scala development ecosystem and would get fed up and just go back to writing straight Java (cough, SBT, cough). Today I decided to write down everything I did and get a sane process going for Scala development with SBT.

I decided to write a small Scala client for OpenWeatherMap - here is what I went through.

A brief guide on naming conventions is here. I found this useful just to reference conventions since not everything is the same as Ruby (camelCase vs snake_case for instance).

Setting up and starting a project

First make sure you have a JVM, Scala, and SBT installed. I’ll be using Scala 2.10.2 and SBT 0.12.1 since that is what I have installed.

One of the nice things I like about Ruby on Rails is the project generation ( aka: rails new project [opts] ) so I was looking for something similar with Scala.

Enter giter8: https://github.com/n8han/giter8

giter8 runs through SBT and has templates available for quickstart.

Follow the install instructions and install giter8 into SBT globally and load SBT to make sure it downloads and installs.

Once you do that you can pick a template from the list, or go with the one I chose: fayimora/basic-scala-project which sets up the directories properly and also sets up ScalaTest, a testing framework with a DSL similar to RSpec.

To set up your project you need to run:

g8 fayimora/basic-scala-project

You’ll be prompted with several questions and then your project will be made. Switch into that directory and run sbt test to make sure the simple HelloWorld passes and everything with SBT is working.

Setting up IntelliJ

For Java and Scala projects I stick with IntelliJ over my usual vim. With Java, IntelliJ is good about picking up library and class paths and resolving dependencies (especially if you are using Maven). However, as of this writing there isn’t an SBT plugin that manages all of this from inside IntelliJ.

The closest thing I’ve found is sbt-idea, which generates the IntelliJ project files from your SBT build. You’re going to need to make a project/plugins.sbt file:

plugins.sbt
addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.5.2")

and now you can generate your .idea files by running: sbt gen-idea

IntelliJ should now resolve your project dependencies and you can start coding your project.

Final Result

scala-weather - A simple to use OpenWeatherMap client in Scala set up with Travis-CI and CodeClimate. This is just the first of several projects I plan on working on / open sourcing to get my feet wet with Scala more.

Useful libraries

Notes

By default Bee Client will log everything to STDOUT - you’ll need to configure logback with an XML file located in src/main/resources/logback.xml:

src/main/resources/logback.xml
<configuration>

    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <root level="ERROR">
        <appender-ref ref="STDOUT" />
    </root>
</configuration>

From 0 to Testing on Windows With JRuby

Testing is one of the most important parts of software development: it helps keep bugs out of production and lets code be refactored safely. If you’re working on a team with different skill sets, you might have testers who only know Windows while development only uses OSX or Linux. We want everyone to be able to test - someone in QA who is familiar with Windows shouldn’t have to throw away all that knowledge, install Linux, and start from scratch. Enter JRuby and John.

John is our tester and he is running Windows. He wants to make sure that when a user goes to http://google.com/ a button appears with the text “Google Search”. The quick way to do this is to open his browser, navigate to http://google.com/, glance through the page for the button and confirm that it’s there. John has a problem though: he has 30 other test cases to run and the developers are pushing code to the front page several times a day. John now has to repeat this manually every time code is touched, and his test load is piling up.

So let’s help John out and install Sublime Text 2 and JRuby.

Start by downloading the 64-bit version of Sublime Text. Make sure to add the context menu when going through the install process.

Now we’ll visit the JRuby homepage and download the 64 bit installer.

Go through the installer and let JRuby set your path so you can access ruby from cmd.exe

Now when we open cmd.exe and type jruby -v we’ll be able to see that it was installed.

Now that we have our tools installed, let’s set up our test directory on the Desktop. Inside our testing folder we’ll create a folder called TestDemo to hold the tests for the Demo project.

Next we’ll open Sublime Text and go to File > Open Folder and navigate to our TestDemo folder and hit open.

Now we can continue making our directory structure inside Sublime Text. Since we’re going to use rspec we need to create a folder called spec to contain all of our tests. Right click on the TestDemo in the tree navigation and click New Folder.

Call the folder spec in the bottom title bar when it prompts you for the folder name.

Next we’ll create our Gemfile, which declares all of our dependencies. Make a file in the project root called Gemfile and put our dependencies in it:

Gemfile
source "https://rubygems.org"

gem "rspec"
gem "selenium"
gem "selenium-webdriver"
gem "capybara"

Once we have that file created, open cmd.exe and switch to your project’s root directory.

Type jgem install bundler to install bundler, which manages Ruby dependencies.

While still at the command prompt, run bundle to install our dependencies.

After that finishes we need to run one last command for selenium to work properly: selenium install

We also need a spec_helper.rb file inside our spec directory.

spec\spec_helper.rb
require "rspec"
require "selenium"
require "capybara/rspec"

Capybara.default_driver = :selenium

We’ve now set up our rspec folders, written our Gemfile with dependencies, and installed them. Now we can write the test that will save John a ton of time.

Chrome comes with a simple tool to get XPath paths, so we’re going to use that to get the XPath for the search button. Right click on the “Google Search” button and click Inspect element.

Right click on the highlighted element and hit Copy XPath.

Now we’re going to make our spec file and call it homepage_spec.rb and locate it under spec\integration.

Here is the directory structure with the files we created:

TestDemo
├── Gemfile
└── spec
    ├── integration
    │   └── homepage_spec.rb
    └── spec_helper.rb

Here is the spec file with comments explaining each part:

spec\integration\homepage_spec.rb
# This loads the spec helper file that we required everything in
require "spec_helper"

# This is the outer level description of the test
# For this example it describes going to the homepage of Google.com
# Setting the feature type is necessary if you have
# Capybara specs outside of the spec\features folder
describe "Going to google.com", :type => :feature do

  # Context is like testing a specific component of the homepage, in this case
  # it's the search button
  context "The search button" do
    # This is our actual test where we give it a meaningful test description
    it "should contain the text 'Google Search'" do
      visit "http://google.com/" # Opens Firefox and visits google
      button = find(:xpath, '//*[@id="gbqfba"]') # find an object on the page by its XPath path
      # This uses an rspec assertion saying that the string returned
      # by button.text is equal to "Google Search"
      button.text.should eq("Google Search")

    end
  end

end

Now we can tab back to our cmd.exe prompt and run our tests! rspec spec will run all your tests under the spec folder.

Things to take note of

This example scenario is showing how to automate browser testing to do end-to-end tests on a product using rspec. This is by no means everything you can do with rspec and ruby - you can SSH, hit APIs and parse JSON, and do anything you want with the ability to make assertions.

A lot is going on in these examples - there are plenty of resources out there on google and other websites that provide more rspec examples and ruby examples.

We also showed how to add dependencies and install them using bundler. Two of the best resources for finding libraries and other gems are RubyGems and Ruby-Toolbox - the only thing to watch out for is any gem that says it is a native C extension (those won’t work with JRuby out of the box).

My last note is that you also need to have Firefox installed. Selenium will work with Chrome, but I’ve found it to be a hassle to set up, and unless you really need Chrome the default of Firefox will work great.

A Simple Ruby Plugin System

Let’s start out with a simple directory structure:

.
├── plugin.rb
├── main.rb
└── plugins
    ├── cat.rb
    └── dog.rb

1 directory, 4 files

All the plugins we will use for our library will be loaded from plugins. Now let’s make a simple Plugin class and register our plugins.

plugin.rb
require 'set'

class Plugin
  # Keep the plugin list inside a set so we don't double-load plugins
  @plugins = Set.new

  def self.plugins
    @plugins
  end

  def self.register_plugins
    # Iterate over the name (a symbol) of each top-level constant
    Object.constants.each do |klass|
      # Look up the constant itself using the symbol
      const = Object.const_get(klass)
      # Check if the constant has a superclass and if that superclass is Plugin
      if const.respond_to?(:superclass) && const.superclass == Plugin
        @plugins << const
      end
    end
  end
end

We’ve now made a simple class that will contain all of our plugin data when we call register_plugins.

Now for our Dog and Cat classes:

dog.rb
class DogPlugin < Plugin

  def handle_command(cmd)
    p "Command received #{cmd}"
  end

end

cat.rb
class CatPlugin < Plugin

  def handle_command(cmd)
    p "Command received #{cmd}"
  end

end

Now combine this all together in one main entry point and we have a simple plugin system that lets us send messages to each plugin through a common method ( handle_command ).

main.rb
require './plugin'
Dir["./plugins/*.rb"].each { |f| require f }
Plugin.register_plugins

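# Test that we can send a message to each plugin.
# The set holds classes and handle_command is an instance method,
# so each plugin is instantiated first.
Plugin.plugins.each do |plugin|
  plugin.new.handle_command('test')
end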

This is a very simple but useful way to make a plugin system to componentize projects like a chat bot for IRC.
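
A different technique from the constant scan above is Ruby’s Class#inherited hook, which registers each subclass at the moment it is defined, so no separate register_plugins step is needed. The sketch below is an alternative, not the approach used in plugin.rb; HookedPlugin, BirdPlugin and FishPlugin are hypothetical names chosen so they don’t clash with the classes above.

```ruby
require 'set'

# Hypothetical variant of the Plugin base class that registers subclasses
# as soon as they are defined, using the Class#inherited hook.
class HookedPlugin
  @plugins = Set.new

  class << self
    attr_reader :plugins

    # Ruby calls this hook whenever a subclass is defined
    def inherited(subclass)
      super
      HookedPlugin.plugins << subclass
    end
  end

  def handle_command(cmd)
    raise NotImplementedError, "#{self.class} must implement handle_command"
  end
end

class BirdPlugin < HookedPlugin
  def handle_command(cmd)
    "Bird received #{cmd}"
  end
end

class FishPlugin < HookedPlugin
  def handle_command(cmd)
    "Fish received #{cmd}"
  end
end

# Instantiate each registered class and send it a message
replies = HookedPlugin.plugins.map { |plugin| plugin.new.handle_command('test') }
puts replies
```

The trade-off is that the base class has to be loaded before any plugin file, which main.rb already does by requiring plugin.rb first.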

Why Setuid Is Bad and What You Can Do

Why setuid is Bad

setuid allows a binary to be run as a different user than the one invoking it. For example, ping needs to use low-level system interfaces (socket, PF_INET, SOCK_RAW, etc.) in order to function properly. We can watch this in action by starting ping in another terminal window ( ping google.com ) and then using strace to see the syscalls being made:

Run sudo strace -p PID and we get the following:

strace output
munmap(0x7f329e7ea000, 4096)            = 0
stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=185, ...}) = 0
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, 16) = 0

We can find all setuid programs installed by issuing the command:

How to find all setuid programs
sudo find / -xdev \( -perm -4000 \) -type f -exec ls -l {} \;

This will find every regular file on the root filesystem that has the setuid bit set in its permissions ( -xdev keeps find from crossing into other mounted filesystems).
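
The same check can be sketched in Ruby with the standard library’s Find module and File::Stat#setuid?. This is only an illustration of what the find command tests for: setuid_files is a hypothetical helper, and unlike find -xdev it does not stop at filesystem boundaries.

```ruby
require 'find'

# Returns every regular file under root that has the setuid bit set.
# Hypothetical helper; does not emulate find's -xdev.
def setuid_files(root)
  found = []
  Find.find(root) do |path|
    begin
      stat = File.lstat(path)
      found << path if stat.file? && stat.setuid?
    rescue SystemCallError
      # Skip entries that cannot be inspected (permissions, races)
      next
    end
  end
  found
end

# Only scan when run directly as a script; the default directory is an assumption
if __FILE__ == $PROGRAM_NAME
  root = ARGV[0] || '/usr/bin'
  puts setuid_files(root) if File.directory?(root)
end
```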

setuid list for a few popular operating systems:

Of particular interest is OpenBSD, where a lot of work was done to remove programs’ need for setuid/setgid permissions. OpenIndiana is the worst offender and has the widest attack surface.

setuid escalation is a common attack vector: unprivileged code run by a regular user exploits a setuid binary to escalate itself to root and drop you into a root shell.

Here are a few examples:

CVE-2012-0056: Exploiting /proc/pid/mem

http://blog.zx2c4.com/749 - C code that uses a bug in the way the Linux kernel checked permissions on /proc/pid/mem and then uses that to exploit the su binary to give a root shell.

CVE-2010-3847: Exploiting via $ORIGIN and file descriptors

http://www.exploit-db.com/exploits/15274/ - By exploiting a hole in the way $ORIGIN is checked, a symlink can be made to a setuid program and exec’d to obtain its file descriptors, which then allows arbitrary code injection (in this case a call to system("/bin/bash")).

More of these can be found at http://www.exploit-db.com/shellcode/ or by searching Google for setuid exploits.

You may not want to completely remove the setuid flag from all the binaries in your distribution, but you can turn on logging to watch when they’re called, and install a kernel patch that hardens the OS and helps prevent 0-days that prey on setuid vulnerabilities.

How to log setuid calls

I will detail the steps to do this on Ubuntu, but the same steps should apply to auditd on CentOS.

Let’s first install auditd: sudo apt-get install auditd

Let’s open up /etc/audit/audit.rules. With a few tweaks in vim, we can turn the list we generated with find into audit rules (each flag is explained after the listing):

/etc/audit/audit.rules
# This file contains the auditctl rules that are loaded
# whenever the audit daemon is started via the initscripts.
# The rules are simply the parameters that would be passed
# to auditctl.

# First rule - delete all
-D

# Increase the buffers to survive stress events.
# Make this bigger for busy systems
-b 320

# Feel free to add below this line. See auditctl man page

-a always,exit -F path=/usr/lib/pt_chown -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/lib/eject/dmcrypt-get-device -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/lib/dbus-1.0/dbus-daemon-launch-helper -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/lib/openssh/ssh-keysign -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/sbin/uuidd -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/sbin/pppd -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/at -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/passwd -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/mtr -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/sudoedit -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/traceroute6.iputils -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/chsh -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/sudo -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/chfn -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/gpasswd -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/usr/bin/newgrp -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/bin/fusermount -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/bin/umount -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/bin/ping -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/bin/ping6 -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/bin/su -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged
-a always,exit -F path=/bin/mount -F perm=x -F auid>=500 -F auid!=4294967295 -k privileged

auditd Options Explained
-a always,exit: appends a rule to the exit list with the always action. This says to always write an audit record when a matching syscall exits.
-F
     path= filters on the executable being called
     perm=x filters on execute access to that file
     auid>=500 logs calls made by users whose login UID is 500 or above (regular user accounts generally start at 1000)
     auid!=4294967295 excludes processes with an unset login UID; a process started before auditd gets an auid of 4294967295
-k sets a filter key that will be written into the audit record, in this case "privileged"
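
Rather than editing the find output by hand, the rule lines can also be generated. The following is a minimal sketch that assumes ls -l style input like the lists at the end of this post; audit_rule and setuid_paths are hypothetical helper names, and paths containing spaces are not handled.

```ruby
# Builds one auditctl rule for a setuid binary, in the same form as the
# rules in /etc/audit/audit.rules. Hypothetical helper.
def audit_rule(path, min_auid: 500, key: 'privileged')
  "-a always,exit -F path=#{path} -F perm=x -F auid>=#{min_auid} -F auid!=4294967295 -k #{key}"
end

# Extracts the path (the last column) from `ls -l` style lines.
# Assumes paths do not contain spaces.
def setuid_paths(ls_output)
  ls_output.lines.map(&:strip).reject(&:empty?).map { |line| line.split.last }
end

sample = <<~LS
  -rwsr-xr-x 1 root    root        35712 Nov  8  2011 /bin/ping
  -rwsr-xr-x 1 root    root        36832 Sep 12 18:29 /bin/su
LS

rules = setuid_paths(sample).map { |path| audit_rule(path) }
puts rules
```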

So now when we run ping google.com we can see a full audit trail in /var/log/audit/audit.log:

auditd output
type=SYSCALL msg=audit(1361852594.621:48): arch=c000003e syscall=59 success=yes exit=0 a0=f43de8 a1=d40488 a2=ed8008 a3=7fffc9c9a150 items=2 ppid=1464 pid=1631 auid=1000 uid=1000 gid=1000 euid=0 suid=0 fsuid=0 egid=1000 sgid=1000 fsgid=1000 tty=pts1 ses=6 comm="ping" exe="/bin/ping" key="privileged"
type=EXECVE msg=audit(1361852594.621:48): argc=2 a0="ping" a1="google.com"
type=BPRM_FCAPS msg=audit(1361852594.621:48): fver=0 fp=0000000000000000 fi=0000000000000000 fe=0 old_pp=0000000000000000 old_pi=0000000000000000 old_pe=0000000000000000 new_pp=ffffffffffffffff new_pi=0000000000000000 new_pe=ffffffffffffffff
type=CWD msg=audit(1361852594.621:48):  cwd="/home/ubuntu"
type=PATH msg=audit(1361852594.621:48): item=0 name="/bin/ping" inode=131711 dev=08:01 mode=0104755 ouid=0 ogid=0 rdev=00:00
type=PATH msg=audit(1361852594.621:48): item=1 name=(null) inode=934 dev=08:01 mode=0100755 ouid=0 ogid=0 rdev=00:00
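
In the SYSCALL record above, uid=1000 with euid=0 is the setuid bit at work: a regular user ran a binary that executed as root. As a rough sketch, records like this can be picked apart in Ruby. The helper names are hypothetical and the sample line is an abbreviated copy of the record above.

```ruby
# Splits one audit.log record into a Hash of its key=value fields.
# Hypothetical helper; quoted values have their quotes removed.
def parse_audit_record(line)
  line.scan(/(\w+)=("[^"]*"|\S+)/).map { |key, value| [key, value.delete('"')] }.to_h
end

# True when the effective UID differs from the real UID in a SYSCALL record
def privilege_change?(record)
  record['type'] == 'SYSCALL' && record['uid'] != record['euid']
end

# Abbreviated copy of the SYSCALL record shown above
line = 'type=SYSCALL msg=audit(1361852594.621:48): arch=c000003e syscall=59 ' \
       'success=yes exit=0 auid=1000 uid=1000 gid=1000 euid=0 suid=0 ' \
       'comm="ping" exe="/bin/ping" key="privileged"'

record = parse_audit_record(line)
if privilege_change?(record)
  puts "#{record['exe']} ran with euid #{record['euid']} for uid #{record['uid']}"
end
```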

Next steps: Patching and upgrading the kernel with GRSecurity

GRSecurity is an awesome tool in the security-minded system administrator’s toolbag. It helps prevent zero-days (like the /proc/pid/mem exploit described above) by restricting which areas of the system a user can access. A full list of its features can be seen at http://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity_and_PaX_Configuration_Options and http://en.wikipedia.org/wiki/Grsecurity#Miscellaneous_features. I suggest going through these before deciding whether you want to continue.

The following is for advanced users. I am not responsible for any issues you may run into; please make sure to test this in a staging/test environment.

Here are the steps I followed to install the patch:

# Start by downloading the latest kernel
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.2.39.tar.bz2

# Next extract it
tar xjvf linux-3.2.39.tar.bz2
cd linux-3.2.39

# Copy over your current kernel configuration:
cp -vi /boot/config-`uname -r` .config

# Updates the config file to match old config and prompts for any new kernel options.
make oldconfig

# Trim the config so that only the modules currently loaded on this machine get compiled.
make localmodconfig

# Bring up the configuration menu
make menuconfig

Once you’re in the menu config you can browse to the Security section, go to Grsecurity and enable it. I set the configuration method to automatic and then went to Customize. For example, you can now go to Kernel Auditing -> Exec logging to turn on some additional logging of shell activity (WARNING: this will generate a lot of log activity, so decide whether you want to use it). I suggest going through all of these options and reading their menu help descriptions (when selecting one, press the ? key to bring up the help).

Now we’ll finish by compiling and installing the kernel:

# Now we can compile the kernel
make -j2 # where 2 is the number of CPUs + 1

# Install and load the dynamic kernel modules
sudo make modules_install

# Finally install kernel
sudo make install

We can now reboot and boot into our GRSecurity patched kernel!

Hopefully this article has provided some insight into what the setuid flag does, how it has been and can be exploited, and what we can do to prevent this in the future.

Here are a few links to useful books on the subject of shellcode and exploits that I recommend:

Below is the list of setuid binaries on each OS

Ubuntu 12.04 LTS (22)


-rwsr-xr-x 1 root    root        31304 Mar  2  2012 /bin/fusermount
-rwsr-xr-x 1 root    root        94792 Mar 30  2012 /bin/mount
-rwsr-xr-x 1 root    root        35712 Nov  8  2011 /bin/ping
-rwsr-xr-x 1 root    root        40256 Nov  8  2011 /bin/ping6
-rwsr-xr-x 1 root    root        36832 Sep 12 18:29 /bin/su
-rwsr-xr-x 1 root    root        69096 Mar 30  2012 /bin/umount
-rwsr-sr-x 1 daemon  daemon      47928 Oct 25  2011 /usr/bin/at
-rwsr-xr-x 1 root    root        41832 Sep 12 18:29 /usr/bin/chfn
-rwsr-xr-x 1 root    root        37096 Sep 12 18:29 /usr/bin/chsh
-rwsr-xr-x 1 root    root        63848 Sep 12 18:29 /usr/bin/gpasswd
-rwsr-xr-x 1 root    root        62400 Jul 28  2011 /usr/bin/mtr
-rwsr-xr-x 1 root    root        32352 Sep 12 18:29 /usr/bin/newgrp
-rwsr-xr-x 1 root    root        42824 Sep 12 18:29 /usr/bin/passwd
-rwsr-xr-x 2 root    root        71288 May 31  2012 /usr/bin/sudo
-rwsr-xr-x 2 root    root        71288 May 31  2012 /usr/bin/sudoedit
-rwsr-xr-x 1 root    root        18912 Nov  8  2011 /usr/bin/traceroute6.iputils
-rwsr-xr-- 1 root    messagebus 292944 Oct  3 13:03 /usr/lib/dbus-1.0/dbus-daemon-launch-helper
-rwsr-xr-x 1 root    root        10408 Dec 13  2011 /usr/lib/eject/dmcrypt-get-device
-rwsr-xr-x 1 root    root       240984 Apr  2  2012 /usr/lib/openssh/ssh-keysign
-rwsr-xr-x 1 root    root        10592 Oct  5 16:08 /usr/lib/pt_chown
-rwsr-xr-- 1 root    dip        325744 Feb  4  2011 /usr/sbin/pppd
-rwsr-sr-x 1 libuuid libuuid     18856 Mar 30  2012 /usr/sbin/uuidd

CentOS 6.3 (21)


-rwsr-xr-x. 1 root root  76056 Nov  5 05:21 /bin/mount
-rwsr-xr-x. 1 root root  40760 Jul 19  2011 /bin/ping
-rwsr-xr-x. 1 root root  36488 Jul 19  2011 /bin/ping6
-rwsr-xr-x. 1 root root  34904 Jun 22  2012 /bin/su
-rwsr-xr-x. 1 root root  50496 Nov  5 05:21 /bin/umount
-rwsr-x---. 1 root dbus  46232 Sep 13 13:04 /lib64/dbus-1/dbus-daemon-launch-helper
-rwsr-xr-x. 1 root root  10272 Apr 16  2012 /sbin/pam_timestamp_check
-rwsr-xr-x. 1 root root  34840 Apr 16  2012 /sbin/unix_chkpwd
-rwsr-xr-x. 1 root root  54240 Jan 30  2012 /usr/bin/at
-rwsr-xr-x. 1 root root  66352 Dec  7  2011 /usr/bin/chage
-rws--x--x. 1 root root  20184 Nov  5 05:21 /usr/bin/chfn
-rws--x--x. 1 root root  20056 Nov  5 05:21 /usr/bin/chsh
-rwsr-xr-x. 1 root root  47520 Jul 19  2011 /usr/bin/crontab
-rwsr-xr-x. 1 root root  71480 Dec  7  2011 /usr/bin/gpasswd
-rwsr-xr-x. 1 root root  36144 Dec  7  2011 /usr/bin/newgrp
-rwsr-xr-x. 1 root root  30768 Feb 22  2012 /usr/bin/passwd
---s--x--x. 2 root root 219272 Aug  6  2012 /usr/bin/sudo
---s--x--x. 2 root root 219272 Aug  6  2012 /usr/bin/sudoedit
-rwsr-xr-x. 1 root root 224912 Nov  9 07:49 /usr/libexec/openssh/ssh-keysign
-rws--x--x. 1 root root  14280 Jan 31 06:30 /usr/libexec/pt_chown
-rwsr-xr-x. 1 root root   9000 Sep 17 05:55 /usr/sbin/usernetctl

OpenBSD 5.2 (3)


-r-sr-xr-x  1 root  bin       242808 Aug  1  2012 /sbin/ping
-r-sr-xr-x  1 root  bin       263288 Aug  1  2012 /sbin/ping6
-r-sr-x---  1 root  operator  222328 Aug  1  2012 /sbin/shutdown

OpenIndiana 11 (53)


-rwsr-xr-x   1 root     bin        64232 Jun 30  2012 /sbin/wificonfig
--wS--lr-x   1 root     root           0 Dec 11 15:20 /media/.hal-mtab-lock
-r-sr-xr-x   1 root     bin       206316 Dec 11 21:00 /usr/lib/ssh/ssh-keysign
-rwsr-xr-x   1 root     adm        12140 Jun 30  2012 /usr/lib/acct/accton
-r-sr-xr-x   1 root     bin        23200 Jun 30  2012 /usr/lib/fs/ufs/quota
-r-sr-xr-x   1 root     bin       111468 Jun 30  2012 /usr/lib/fs/ufs/ufsrestore
-r-sr-xr-x   1 root     bin       106964 Jun 30  2012 /usr/lib/fs/ufs/ufsdump
-r-sr-xr-x   1 root     bin        18032 Jun 30  2012 /usr/lib/fs/smbfs/umount
-r-sr-xr-x   1 root     bin        18956 Jun 30  2012 /usr/lib/fs/smbfs/mount
-r-sr-xr-x   1 root     bin        12896 Jun 30  2012 /usr/lib/utmp_update
-r-sr-xr-x   1 root     bin        35212 Jun 30  2012 /usr/bin/fdformat
-r-s--x--x   2 root     bin       188080 Jun 30  2012 /usr/bin/sudoedit
-r-sr-xr-x   1 root     sys        34876 Jun 30  2012 /usr/bin/su
-r-sr-xr-x   1 root     bin        42504 Jun 30  2012 /usr/bin/login
-r-sr-xr-x   1 root     bin       257288 Jun 30  2012 /usr/bin/pppd
-r-sr-xr-x   1 root     sys        46208 Jun 30  2012 /usr/bin/chkey
-r-sr-xr-x   1 root     sys        29528 Jun 30  2012 /usr/bin/amd64/newtask
-r-sr-xr-x   2 root     bin        24432 Jun 30  2012 /usr/bin/amd64/w
-r-sr-xr-x   1 root     bin      3224200 Jun 30  2012 /usr/bin/amd64/Xorg
-r-sr-xr-x   2 root     bin        24432 Jun 30  2012 /usr/bin/amd64/uptime
-rwsr-xr-x   1 root     sys        47804 Jun 30  2012 /usr/bin/at
-r-sr-xr-x   1 root     bin         8028 Jun 30  2012 /usr/bin/mailq
-r-sr-xr-x   1 root     bin        33496 Jun 30  2012 /usr/bin/rsh
-r-sr-xr-x   1 root     bin        68704 Jun 30  2012 /usr/bin/rmformat
-r-sr-sr-x   1 root     sys        31292 Jun 30  2012 /usr/bin/passwd
-rwsr-xr-x   1 root     sys        23328 Jun 30  2012 /usr/bin/atrm
-r-sr-xr-x   1 root     bin        97072 Jun 30  2012 /usr/bin/xlock
-r-sr-xr-x   1 root     bin        78672 Jun 30  2012 /usr/bin/rdist
-r-sr-xr-x   1 root     bin        27072 Jun 30  2012 /usr/bin/sys-suspend
-r-sr-xr-x   1 root     bin        29304 Jun 30  2012 /usr/bin/crontab
-r-sr-xr-x   1 root     bin        53080 Jun 30  2012 /usr/bin/rcp
-r-s--x--x   2 root     bin       188080 Jun 30  2012 /usr/bin/sudo
-r-s--x--x   1 uucp     bin        70624 Jun 30  2012 /usr/bin/tip
-rwsr-xr-x   1 root     sys        18824 Jun 30  2012 /usr/bin/atq
-r-sr-xr-x   1 root     bin       281732 Jun 30  2012 /usr/bin/xscreensaver
-r-sr-xr-x   1 root     bin      2767780 Jun 30  2012 /usr/bin/i86/Xorg
-r-sr-xr-x   1 root     sys        22716 Jun 30  2012 /usr/bin/i86/newtask
-r-sr-xr-x   2 root     bin        22020 Jun 30  2012 /usr/bin/i86/w
-r-sr-xr-x   2 root     bin        22020 Jun 30  2012 /usr/bin/i86/uptime
-rwsr-xr-x   1 root     sys        13636 Jun 30  2012 /usr/bin/newgrp
-r-sr-xr-x   1 root     bin        39224 Jun 30  2012 /usr/bin/rlogin
-rwsr-xr-x   1 svctag   daemon    108964 Jun 30  2012 /usr/bin/stclient
-r-sr-xr-x   1 root     bin        29324 Jun 30  2012 /usr/xpg4/bin/crontab
-rwsr-xr-x   1 root     sys        47912 Jun 30  2012 /usr/xpg4/bin/at
-r-sr-xr-x   3 root     bin        41276 Jun 30  2012 /usr/sbin/deallocate
-rwsr-xr-x   1 root     sys        32828 Jun 30  2012 /usr/sbin/sacadm
-r-sr-xr-x   1 root     bin        46512 Jun 30  2012 /usr/sbin/traceroute
-r-sr-xr-x   1 root     bin        18016 Jun 30  2012 /usr/sbin/i86/whodo
-r-sr-xr-x   1 root     bin        55584 Jun 30  2012 /usr/sbin/ping
-r-sr-xr-x   3 root     bin        41276 Jun 30  2012 /usr/sbin/allocate
-r-sr-xr-x   1 root     bin        37320 Jun 30  2012 /usr/sbin/pmconfig
-r-sr-xr-x   3 root     bin        41276 Jun 30  2012 /usr/sbin/list_devices
-r-sr-xr-x   1 root     bin        24520 Jun 30  2012 /usr/sbin/amd64/whodo

Securing Ubuntu

Table of Contents

Initial Setup

Setting up iptables and Fail2Ban

Fail2Ban
iptables rules

Make shared memory read-only

Setting up Bastille Linux

Configuring Bastille

sysctl hardening

Setting up a chroot environment

Securing nginx inside the chroot

Extras

Initial Setup

Let’s log in to our new machine and take some initial steps to secure our system. For this article I’m going to assume your username is ubuntu.

If you need to, set up your sudoers file by adding the following line:

/etc/sudoers
ubuntu ALL=(ALL:ALL) ALL # put this in the "User privilege specification" section

Edit your ~/.ssh/authorized_keys and put your public key inside it. Once your key is in place, make sure you can log in without a password.

Open up /etc/ssh/sshd_config and make sure these lines exist to secure SSH:

/etc/ssh/sshd_config
# Only allow version 2 communications, version 1 has known vulnerabilities
Protocol 2
# Disable root login over ssh
PermitRootLogin no
# Load authorized keys files from a user's home directory
AuthorizedKeysFile  %h/.ssh/authorized_keys
# Don't allow empty passwords to be used to authenticate
PermitEmptyPasswords no
# Disable password auth, you must use ssh keys
PasswordAuthentication no
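
If you want to double-check the file before restarting sshd, a small script can compare it against the settings listed above. This is only a sketch: the helper name `missing_sshd_settings` and the `REQUIRED_SSHD_SETTINGS` constant are my own and not part of OpenSSH, and it only looks at the four yes/no style options from this article.

```ruby
# Hypothetical helper (not part of OpenSSH): reports which of the settings
# from this article are missing or set differently in an sshd_config text.
REQUIRED_SSHD_SETTINGS = {
  'Protocol'               => '2',
  'PermitRootLogin'        => 'no',
  'PermitEmptyPasswords'   => 'no',
  'PasswordAuthentication' => 'no'
}.freeze

def missing_sshd_settings(config_text)
  found = {}
  config_text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#')   # skip blanks and comments
    key, value = line.split(/\s+/, 2)
    # sshd uses the first value it sees for a keyword, so keep the first one
    found[key.downcase] ||= value.to_s.strip
  end
  REQUIRED_SSHD_SETTINGS.reject { |key, value| found[key.downcase] == value }.keys
end

sample = <<~CONF
  Protocol 2
  PermitRootLogin no
  # PasswordAuthentication no
  PermitEmptyPasswords no
CONF

# PasswordAuthentication is commented out in the sample, so it gets reported
puts missing_sshd_settings(sample).inspect
```

On a real machine you would pass in `File.read('/etc/ssh/sshd_config')` instead of the sample string.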

Keep your current session open and restart sshd:

sudo service ssh restart

Make sure you can log in from another terminal. If you can, move on.

Now we need to update and upgrade so all of our packages are up to date, and install two prerequisites for later in the article: build-essential and ntp.

apt
sudo apt-get update
sudo apt-get install build-essential ntp
sudo apt-get upgrade
sudo reboot

Setting up iptables and Fail2Ban

Fail2Ban

apt
sudo apt-get install fail2ban

Open up the fail2ban config and change the bantime, destemail, and maxretry settings:

/etc/fail2ban/jail.conf
[DEFAULT]
ignoreip = 127.0.0.1/8
bantime  = 3600
maxretry = 2
destemail = ubuntu@yourdomain.com
action = %(action_mw)s

[ssh]

enabled  = true
port     = ssh
filter   = sshd
logpath  = /var/log/auth.log
maxretry = 2

Now restart fail2ban.

sudo service fail2ban restart

If you try to log in from another machine and fail, you should see its IP in iptables:

# sudo iptables -L
Chain fail2ban-ssh (1 references)
target     prot opt source               destination
DROP       all  --  li203-XX.members.linode.com  anywhere
RETURN     all  --  anywhere             anywhere

iptables Rules

Here are my default iptables rules. They open up ports 80 and 443 for HTTP/HTTPS traffic and allow SSH on port 22. We also allow ping, log all denied packets, and then reject everything else. If you have other services you need to run, such as a game server, you’ll have to add rules to open up those ports in the iptables config.

/etc/iptables.up.rules
*filter

# Accepts all established inbound connections
 -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allows all outbound traffic
# You could modify this to only allow certain traffic
 -A OUTPUT -j ACCEPT

# Allows HTTP and HTTPS connections from anywhere (the normal ports for websites)
 -A INPUT -p tcp --dport 443 -j ACCEPT
 -A INPUT -p tcp --dport 80 -j ACCEPT
# Allows SSH connections for script kiddies
# THE --dport NUMBER IS THE SAME ONE YOU SET UP IN THE SSHD_CONFIG FILE
 -A INPUT -p tcp -m state --state NEW --dport 22 -j ACCEPT

# Now you should read up on iptables rules and consider whether ssh access
# for everyone is really desired. Most likely you will only allow access from certain IPs.

# Allow ping
 -A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT

# log iptables denied calls (access via 'dmesg' command)
 -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables denied: " --log-level 7

# Reject all other inbound - default deny unless explicitly allowed policy:
 -A INPUT -j REJECT
 -A FORWARD -j REJECT

COMMIT
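
Before loading a rules file it can help to double-check which TCP ports it actually opens. Here is a rough sketch that scans the file for ACCEPT rules on the INPUT chain; `accepted_tcp_ports` is a made-up helper name, and the regular expression only understands the simple `--dport N` form used in this article (no port ranges or multiport).

```ruby
# Hypothetical helper: lists the TCP ports that a rules file accepts on INPUT.
# Only handles plain "--dport N" rules like the ones in this article.
def accepted_tcp_ports(rules_text)
  rules_text.each_line.map do |line|
    next if line.strip.start_with?('#')           # ignore commented-out rules
    match = line.match(/-A INPUT\b.*-p tcp\b.*--dport (\d+)\b.*-j ACCEPT/)
    match && match[1].to_i
  end.compact.sort
end

rules = <<~RULES
  *filter
   -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
   -A INPUT -p tcp --dport 443 -j ACCEPT
   -A INPUT -p tcp --dport 80 -j ACCEPT
   -A INPUT -p tcp -m state --state NEW --dport 22 -j ACCEPT
   -A INPUT -j REJECT
  COMMIT
RULES

puts accepted_tcp_ports(rules).inspect
```

For the real file you would call `accepted_tcp_ports(File.read('/etc/iptables.up.rules'))`.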

We can load that up into iptables:

sudo iptables-restore < /etc/iptables.up.rules

Make sure it loads on boot by putting it into the if-up scripts:

/etc/network/if-up.d/iptables
#!/bin/sh
# iptables-restore reads the rules from standard input
iptables-restore < /etc/iptables.up.rules

Now make it executable:

sudo chmod +x /etc/network/if-up.d/iptables

Rebooting here is optional, but I usually reboot after major changes to make sure everything boots up properly.

If you’re getting hit by scanners or brute-force attacks, you’ll see a line similar to this in your /var/log/syslog:

Jan 18 03:30:37 localhost kernel: [   79.631680] iptables denied: IN=eth0 OUT= MAC=04:01:01:40:70:01:00:12:f2:c6:e8:00:08:00 SRC=87.13.110.30 DST=192.34.XX.XX LEN=64 TOS=0x00 PREC=0x00 TTL=34 ID=57021 DF PROTO=TCP SPT=1253 DPT=135 WINDOW=53760 RES=0x00 SYN URGP=0
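
If you want to summarize who is knocking on which ports, the denied lines are easy to pick apart since the kernel writes them as KEY=VALUE pairs. A minimal sketch, assuming the "iptables denied: " log prefix configured above; `parse_denied_line` is just a name I picked for illustration.

```ruby
# Parses the KEY=VALUE pairs out of a kernel log line written by the LOG rule.
# `parse_denied_line` is a hypothetical helper name used for illustration.
def parse_denied_line(line)
  return nil unless line.include?('iptables denied: ')
  payload = line.split('iptables denied: ', 2).last
  fields = {}
  payload.split(/\s+/).each do |token|
    key, value = token.split('=', 2)
    # Flags such as SYN or DF have no "=value" part; record them as true
    fields[key] = value.nil? ? true : value
  end
  fields
end

line = 'Jan 18 03:30:37 localhost kernel: [   79.631680] iptables denied: ' \
       'IN=eth0 OUT= SRC=87.13.110.30 DST=192.34.XX.XX PROTO=TCP SPT=1253 DPT=135 SYN'
entry = parse_denied_line(line)
puts "#{entry['SRC']} tried #{entry['PROTO']} port #{entry['DPT']}"
```

Feeding every line of /var/log/syslog through this and counting by `SRC` or `DPT` gives a quick picture of what is being scanned.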

Read only shared memory

A common exploit vector is going through shared memory (which can let an attacker change the UID of running programs and perform other malicious actions). It can also be used as a place to drop files once an initial break-in has been made. An example of one such exploit is available here.

Open /etc/fstab and add this line:

/etc/fstab
tmpfs     /dev/shm     tmpfs     defaults,ro     0     0

Once you do this you need to reboot.

Setting up Bastille Linux

The Bastille Hardening program “locks down” an operating system, proactively configuring the system for increased security and decreasing its susceptibility to compromise. Bastille can also assess a system’s current state of hardening, granularly reporting on each of the security settings with which it works.

Bastille: Installation and Setup
sudo apt-get install bastille # choose Internet site for postfix
# configure bastille
sudo bastille

After you run that command you’ll be prompted to configure your system. Here are the options I chose:

Configuring Bastille

  • File permissions module: Yes (suid)
  • Disable SUID for mount/umount: Yes
  • Disable SUID on ping: Yes
  • Disable clear-text r-protocols that use IP-based authentication? Yes
  • Enforce password aging? No (situation dependent, I have no users accessing my machines except me, and I only allow ssh keys)
  • Default umask: Yes
  • Umask: 077
  • Disable root login on tty’s 1-6: No
  • Password protect GRUB prompt: No (situation dependent, I’m on a VPS and would like to get support in case I need it)
  • Password protect su mode: Yes
  • default-deny on tcp-wrappers and xinetd? No
  • Ensure telnet doesn’t run? Yes
  • Ensure FTP does not run? Yes
  • display authorized use message? No (situation dependent, if you had other users, Yes)
  • Put limits on system resource usage? Yes
  • Restrict console access to group of users? Yes (then choose root)
  • Add additional logging? Yes
  • Set up remote logging? No (I don’t have a remote log host; answer Yes if you do)
  • Setup process accounting? Yes
  • Disable acpid? Yes
  • Deactivate nfs + samba? Yes (situation dependent)
  • Stop sendmail from running in daemon mode? No (I have this firewalled off, so I’m not concerned)
  • Deactivate apache? Yes
  • Disable printing? Yes
  • TMPDIR/TMP scripts? No (if a multi-user system, yes)
  • Packet filtering script? No (we configured the firewall previously)
  • Finished? YES! & reboot

You can verify some of these changes by testing them out, for instance, the SUID change on ping:

Bastille: Verifying changes
ubuntu@app1:~$ ping google.com
ping: icmp open socket: Operation not permitted
ubuntu@app1:~$ sudo ping google.com
PING google.com (74.125.228.72) 56(84) bytes of data.
64 bytes from iad23s07-in-f8.1e100.net (74.125.228.72): icmp_req=1 ttl=55 time=9.06 ms
^C
--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 9.067/9.067/9.067/0.000 ms

Sysctl hardening

Since our machine isn’t running as a router and is going to be running as an application/web server, there are additional steps we can take to secure the machine. Many of these are from the NSA’s security guide, which you can read in its entirety here.

/etc/sysctl.conf Source
# Protect ICMP attacks
net.ipv4.icmp_echo_ignore_broadcasts = 1

# Turn on protection for bad icmp error messages
net.ipv4.icmp_ignore_bogus_error_responses = 1

# Turn on syncookies for SYN flood attack protection
net.ipv4.tcp_syncookies = 1

# Log suspicious packets, such as spoofed, source-routed, and redirect
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1

# Disables these ipv4 features, not very legitimate uses
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0

# Enables RFC-recommended source validation (don't use on a router)
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1

# Make sure no one can alter the routing tables
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0

# Host only (we're not a router)
net.ipv4.ip_forward = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0


# Turn on exec-shield
kernel.exec-shield = 1
kernel.randomize_va_space = 1

# Tune IPv6
net.ipv6.conf.default.router_solicitations = 0
net.ipv6.conf.default.accept_ra_rtr_pref = 0
net.ipv6.conf.default.accept_ra_pinfo = 0
net.ipv6.conf.default.accept_ra_defrtr = 0
net.ipv6.conf.default.autoconf = 0
net.ipv6.conf.default.dad_transmits = 0
net.ipv6.conf.default.max_addresses = 1

# Optimization for port use for LBs
# Increase system file descriptor limit
fs.file-max = 65535

# Allow for more PIDs (to reduce rollover problems); may break some programs 32768
kernel.pid_max = 65536

# Increase system IP port limits
net.ipv4.ip_local_port_range = 2000 65000

# Increase TCP max buffer size settable using setsockopt()
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608

# Increase Linux auto tuning TCP buffer limits
# min, default, and max number of bytes to use
# set max to at least 4MB, or higher if you use very high BDP paths
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_window_scaling = 1

After making these changes you should reboot.
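
After the reboot you may want to confirm the kernel actually picked the values up. Here is a sketch that parses a sysctl.conf and compares it with what is under /proc/sys; the helper names (`parse_sysctl`, `sysctl_path`, `sysctl_mismatches`) are my own invention, and keys the running kernel doesn’t expose are simply skipped.

```ruby
# Hypothetical helpers for checking a sysctl.conf against the running kernel.
def parse_sysctl(text)
  text.each_line.with_object({}) do |line, settings|
    line = line.strip
    next if line.empty? || line.start_with?('#', ';')   # blanks and comments
    key, value = line.split('=', 2)
    next if value.nil?
    settings[key.strip] = value.split.join(' ')         # normalize whitespace
  end
end

# net.ipv4.ip_forward lives at /proc/sys/net/ipv4/ip_forward
def sysctl_path(key)
  File.join('/proc/sys', key.tr('.', '/'))
end

# Returns { key => { wanted:, actual: } } for every value that differs
def sysctl_mismatches(settings)
  settings.each_with_object({}) do |(key, wanted), diff|
    path = sysctl_path(key)
    next unless File.readable?(path)                    # kernel lacks this key
    actual = File.read(path).split.join(' ')
    diff[key] = { wanted: wanted, actual: actual } unless actual == wanted
  end
end

conf = "# Host only\nnet.ipv4.ip_forward = 0\nnet.ipv4.tcp_rmem = 4096 87380 8388608\n"
p parse_sysctl(conf)
```

On the server itself, `sysctl_mismatches(parse_sysctl(File.read('/etc/sysctl.conf')))` should come back empty once everything has been applied.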

Setting up a chroot environment

We’ll be setting up a chroot environment to run our web server and applications in. Chroots provide isolation from the rest of the operating system, so even in the event of an application compromise, damage can be mitigated.

chroot: Installation and Setup
sudo apt-get install debootstrap dchroot

Now add this to your /etc/schroot/schroot.conf file. precise is the release of Ubuntu I’m using, so change it if you need to:

/etc/schroot/schroot.conf
[precise]
description=Ubuntu Precise LTS
location=/var/chroot
priority=3
users=ubuntu
groups=sbuild
root-groups=root

Now bootstrap the chroot with a minimal Ubuntu installation:

sudo debootstrap --variant=buildd --arch amd64 precise /var/chroot/ http://mirror.anl.gov/pub/ubuntu/
sudo cp /etc/resolv.conf /var/chroot/etc/resolv.conf
sudo mount -o bind /proc /var/chroot/proc
sudo chroot /var/chroot/
apt-get install ubuntu-minimal
apt-get update

Add the following to /etc/apt/sources.list inside the chroot:

deb http://archive.ubuntu.com/ubuntu precise main
deb http://archive.ubuntu.com/ubuntu precise-updates main
deb http://security.ubuntu.com/ubuntu precise-security main
deb http://archive.ubuntu.com/ubuntu precise universe
deb http://archive.ubuntu.com/ubuntu precise-updates universe

Let’s test out our chroot and install nginx inside of it:

apt-get update
apt-get install nginx

Securing nginx inside the chroot

First thing we will do is add a www user for nginx to run under:

Adding an application user
sudo chroot /var/chroot
useradd www -d /home/www
mkdir /home/www
chown -R www:www /home/www

Open up /etc/nginx/nginx.conf and make sure you change user to www inside the chroot:

/etc/nginx/nginx.conf
user www;

We can now start nginx inside the chroot:

sudo chroot /var/chroot
service nginx start

Now if you go to http://your_vm_ip/ you should see “Welcome to nginx!” running inside your fancy new chroot.

We also need to setup ssh to run inside the chroot so we can deploy our applications more easily.

Chroot: sshd
sudo chroot /var/chroot
apt-get install openssh-server udev

Since we already have SSH for the main host running on 22, we’re going to run SSH for the chroot on port 2222. We’ll copy over our config from outside the chroot to the chroot.

sshd config
sudo cp /etc/ssh/sshd_config /var/chroot/etc/ssh/sshd_config

Now open /var/chroot/etc/ssh/sshd_config and change the Port setting to 2222.

We also need to add the rules to our firewall script:

/etc/iptables.up.rules
# Chroot ssh
 -A INPUT -p tcp -m state --state NEW --dport 2222 -j ACCEPT

Now make a startup script for the chroot in /etc/init.d/chroot-precise:

/etc/init.d/chroot-precise
#!/bin/sh
mount -o bind /proc /var/chroot/proc
mount -o bind /dev /var/chroot/dev
mount -o bind /sys /var/chroot/sys
mount -o bind /dev/pts /var/chroot/dev/pts
chroot /var/chroot service nginx start
chroot /var/chroot service ssh start

Set it to executable and to start at boot:

sudo chmod +x /etc/init.d/chroot-precise
sudo update-rc.d chroot-precise defaults

Next is to put your public key inside the .ssh/authorized_keys file for the www user inside the chroot so you can ssh and deploy your applications.

If you want, you can test your server and reboot it now to ensure nginx and ssh boot up properly. If they’re not running right now, you can start them with sudo /etc/init.d/chroot-precise.

You should now be able to ssh into your chroot and main server without a password.

Extras

I would like to also mention the GRSecurity kernel patch. I had tried several times to install this (two different versions were released while I was writing this) and both make the kernel unable to compile. Hopefully they’ll fix these bugs and I’ll be able to update this article with notes on setting GRSecurity up as well.

I hope this article proved useful to anyone trying to secure an Ubuntu system, and if you liked it please share it!

Rb RFO Status: A Simple System Status Page in Ruby

Rb RFO Status is a simple system for posting status updates to your team or customers in an easy-to-understand format, so there is no delay in reporting a reason for outage. It is modeled slightly after the Heroku Status Page.

Source: https://github.com/bluescripts/rb_rfo_status

Download: https://s3.amazonaws.com/josh-opensource/rb_rfo_status-0.1.war

It is licensed under the MIT License so do whatever you want with it!

I’ve already opened up a few issues on Github that are enhancements, but this serves as a super simple application to deploy to keep your customers and team informed of system states.

Installation

Download the .war file and deploy it in your favorite container (Tomcat, etc). Once the war file is extracted you can modify the config settings and start it.

To run migrations on an extracted WAR file:

cd rb_rfo_status/WEB-INF
sudo RAILS_ENV=production BUNDLE_WITHOUT=development:test BUNDLE_GEMFILE=Gemfile GEM_HOME=gems java -cp lib/jruby-core-1.7.1.jar:lib/jruby-stdlib-1.7.1.jar:lib/gems-gems-activerecord-jdbc-adapter-1.2.2.1-lib-arjdbc-jdbc-adapter_java.jar:lib/gems-gems-jdbc-mysql-5.1.13-lib-mysql-connector-java-5.1.13.jar org.jruby.Main -S rake db:migrate

Screenshots

Homepage

Creating an Incident

Updating an incident

A resolved incident

Dealing With Cascading Failures With Chef Server

Chef is awesome. Being able to recreate your entire environment from a recipe is an incredibly powerful tool, and I started using Chef a few months ago. When I initially configured the Chef server I hadn’t paid much attention to the CouchDB portion of it until I had a chef-server hiccup. Here are a few things to watch out for when running chef-server:

  • Set up CouchDB compaction - my Chef server had a CouchDB size of 30+GB (after compaction it was only a few megabytes).
  • When resizing instances, make sure you set up RabbitMQ to use a NODENAME. If you don’t, you’ll run into an issue with RabbitMQ losing the databases that were set up (by default, they’re based on hostname, so if you resize an EC2 instance the hostname may change, and you’ll either have to do some moving around or manually set the NODENAME to the previous hostname).
  • Clients may fail to validate after this, requiring a regeneration of the validation.pem, which is fine since this file is only used for the initial bootstrap of a server.
  • Make sure you run the Chef recipes you’ve set up (for instance monitoring) on your chef-server itself.

I hope these tips will be helpful to other people when they run into a Chef/CouchDB/RabbitMQ issue after a server resize or hostname change. Another really helpful place is #chef on freenode’s IRC servers.

Sidekiq vs Resque, With MRI and JRuby

Before we dive into the benchmarks of Resque vs Sidekiq, it helps to have a better understanding of how forking and threading work in Ruby.

Threading vs Forking

Forking

When you fork a process you are creating an entire copy of that process: the address space and all open file descriptors. You get a separate copy of the address space of the parent process, isolating any work done to that fork. If the forked child process does a lot of work and uses a lot of memory, that memory is freed back to the operating system when the child exits. If your programming language (MRI Ruby) doesn’t support actual kernel level threading, then this is the only way to spread work out across multiple cores, since each process can be scheduled to a different core. You also gain some stability, since if a child crashes the parent can just respawn a new fork. There is a caveat, however: a child that exits without being waited on by its parent lingers as a zombie, and if the parent dies first its remaining children are orphaned.
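
To make the isolation concrete, here is a small sketch (my own example, not from any library) where the child changes a variable and the parent never sees the change. The pipe is only there so the child can report back what it saw; `fork_and_modify` is a hypothetical name. It needs a platform where `fork` is available (Linux, macOS).

```ruby
# The child gets its own copy of the parent's memory: changes made in the
# child are invisible to the parent.
def fork_and_modify
  counter = 0
  reader, writer = IO.pipe

  pid = Process.fork do
    reader.close
    counter += 100          # modifies the child's copy only
    writer.puts counter
    writer.close
    exit!(0)                # leave the child without running at_exit hooks
  end

  writer.close
  child_value = reader.gets.to_i
  reader.close
  Process.waitpid(pid)      # reap the child so it does not linger as a zombie
  [child_value, counter]
end

child_value, parent_value = fork_and_modify
puts "child saw #{child_value}, parent still has #{parent_value}"
```

The `Process.waitpid` call is what keeps the exited child from hanging around in the process table.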

Forking and Ruby

One important note about forking with Ruby is that the maintainers have done a good job of keeping memory usage down when forking. Child forks share memory with the parent through copy-on-write, so pages are only duplicated once they are modified.

fork_pids = []

# Let's fill up some memory in the parent

objs = {}
objs['test'] = []
1_000_000.times do
  objs['test'] << Object.new
end

# Fork 50 children that only sleep, so they never write to the shared memory
50.times do
    fork_pids << Process.fork do
        sleep 0.1
    end
end
# Wait for every child so none of them is left behind as a zombie
fork_pids.map{|p| Process.waitpid(p) }

We can see this in action here:

However when we start modifying memory inside the child forks, memory quickly grows.

50.times do
    fork_pids << Process.fork do
      1_000_000.times do
        # objs is a Hash, so append to the array stored under 'test'
        objs['test'] << Object.new
      end
    end
end
fork_pids.map{|p| Process.waitpid(p) }

We’re now creating a million new objects in each forked child:

Threading

Threads, on the other hand, have considerably less overhead since they share address space and memory, and allow easier communication (versus inter-process communication with forks). Context switching between threads inside the same process is also generally cheaper than scheduling switches between processes. Depending on the runtime being used, any issues that might occur using threads (for instance needing to use lots of memory for a task) can be handled by the garbage collector for the most part. Another benefit of threading is that you do not have to worry about zombie processes, since all threads die when the process dies.
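
Here is a tiny illustration of that shared address space: every thread reads from the same array without any copying or inter-process plumbing, and hands its partial result back through `Thread#value`. The `threaded_sum` helper is a made-up name for this sketch, and on MRI it will not run any faster than a plain loop because of the lock discussed in the next section.

```ruby
# Each thread sums one slice of the same in-memory array. Nothing is copied
# between threads; Thread#value joins the thread and returns its result.
def threaded_sum(numbers, thread_count)
  return 0 if numbers.empty?
  slice_size = (numbers.size / thread_count.to_f).ceil
  threads = numbers.each_slice(slice_size).map do |slice|
    Thread.new { slice.sum }
  end
  threads.sum(&:value)
end

puts threaded_sum((1..1_000).to_a, 4)
```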

Threading with Ruby

As of 1.9 the GIL (Global Interpreter Lock) is gone! But it’s only been renamed to the GVL (Global VM Lock). The GVL in MRI ruby uses a lock called rb_thread_lock_t which is a mutex around when ruby code can be run. When no ruby objects are being touched, you can actually run ruby threads in parallel before the GVL kicks in again (ie: system level blocking call, IO blocking outside of ruby). After these blocking calls each thread checks the interrupt RUBY_VM_CHECK_INTS.

With MRI ruby threads are pre-emptively scheduled using a function called rb_thread_schedule which schedules an “interrupt” that lets each thread get a fair amount of execution time (every 10 microseconds). [source: thread.c:1018]

We can see an example of the GIL/GVL in action here:

threads = []

# objs is a plain Array shared by every thread
objs = []
1_000_000.times do
  objs << Object.new
end

50.times do |num|
  threads << Thread.new do
    1_000_000.times do
      objs << Object.new
    end
  end
end

threads.map(&:join)

Normally this would be an unsafe operation, but since the GIL/GVL exists we don’t have to worry about two threads adding to the same ruby object at once since only one thread can run on the VM at once and it ends up being an atomic operation (although don’t rely on this quirk for thread safety, it definitely doesn’t apply to any other VMs).

Another important note is that the Ruby GC is doing a really horrible job during this benchmark.

The memory kept growing so I had to kill the process after a few seconds.

Threading with JRuby on the JVM

JRuby specifies the use of native threads based on the operating system support using the getNativeThread call [2]. JRuby’s implementation of threads using the JVM means there is no GIL/GVL. This allows CPU bound processes to utilize all cores of a machine without having to deal with forking (which, in the case of resque, can be very expensive).

When trying to execute the GIL safe code above JRuby spits out a concurrency error: ConcurrencyError: Detected invalid array contents due to unsynchronized modifications with concurrent users

We can either add a mutex around this code or modify it to not worry about concurrent access. I chose the latter:

threads = []

objs = {}
objs['test'] = []
1_000_000.times do
  objs['test'] << Object.new
end

50.times do |num|
  threads << Thread.new do
    1_000_000.times do
      objs[num] = [] if objs[num].nil?
      objs[num] << Object.new
    end
  end
end

threads.map(&:join)
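
For completeness, the other option mentioned above (wrapping the shared array in a mutex) could look roughly like this. It is a sketch with smaller numbers than the benchmark, and `fill_with_mutex` is just a name I made up; I did not benchmark this variant.

```ruby
# All threads append to one shared array, but only while holding the lock,
# so the array is never modified by two threads at the same time.
def fill_with_mutex(thread_count, per_thread)
  objs = []
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      per_thread.times do
        lock.synchronize { objs << Object.new }
      end
    end
  end
  threads.each(&:join)
  objs.size
end

puts fill_with_mutex(50, 1_000)
```

The trade-off is that every append now contends for the same lock, which is why giving each thread its own array tends to be the cheaper choice for this kind of test.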

Compared to the MRI version, ruby running on the JVM was able to make some optimizations and keep memory usage around 800MB for the duration of the test:

Now that we have a better understanding of the differences between forking and threading in Ruby, let’s move on to Sidekiq and Resque.

Sidekiq and Resque

Resque’s view of the world

Resque assumes chaos in your environment. It follows the forking model familiar from C and Ruby and makes a complete copy of the Resque parent process each time a new job needs to run. This has its advantages in guarding against memory leaks, long-running workers, and locking. You run into an issue with forking, though, when you need to increase the number of workers on a machine: you end up without enough spare CPU cycles, since the majority are taken up by all the forking.

Resque follows a simple fork-and-do-work model: each worker takes a job off the queue and forks a new process to do the job.
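
Stripped down to its bones, that model looks something like the loop below. To be clear, this is not Resque’s actual code: `run_forking_worker` is a hypothetical name, the jobs are plain lambdas, and an in-memory array stands in for the Redis queue.

```ruby
# Simplified fork-per-job loop in the spirit of the model described above.
# NOT Resque's real implementation; `jobs` is an array standing in for Redis.
def run_forking_worker(jobs)
  statuses = []
  until jobs.empty?
    job = jobs.shift
    pid = Process.fork do
      begin
        job.call
        exit!(0)            # job finished cleanly
      rescue StandardError
        exit!(1)            # job blew up, but only this child dies
      end
    end
    _, status = Process.waitpid2(pid)   # parent waits, then takes the next job
    statuses << status.exitstatus
  end
  statuses
end

jobs = [-> { 1 + 1 }, -> { raise 'boom' }, -> { :ok }]
puts run_forking_worker(jobs).inspect
```

The second job crashes its child process, yet the parent carries on with the third job. That is the "assume chaos" part, and the fork for every single job is where the CPU overhead comes from.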

Resque @ Github

Sidekiq’s view of the world

Unlike Resque, Sidekiq uses threads and is extremely easy to use as a drop in replacement to Resque since they both work on the same perform method. When you dig into the results below you can see that Sidekiq’s claim of being able to handle a larger number of workers and amount of work is true. Due to using threads and not having to allocate a new stack and address space for each fork, you get that overhead back and are able to do more work with a threaded model.

Sidekiq follows the actor pattern. So compared to Resque which has N workers that fork, Sidekiq has an Actor manager, with N threads and one Fetcher actor which will pop jobs off Redis and hand them to the Manager. Sidekiq handles the “chaos” portion of Resque by catching all exceptions and bubbling them up to an exception handler such as Airbrake or Errbit.
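
A very rough thread-based counterpart is sketched below. Again, this is not how Sidekiq is implemented (no actors, no Redis): it is a plain Ruby `Queue` with N worker threads, and `run_threaded_workers` is a name I picked for the example. It does show the two ideas from the paragraph above, one process with many threads and exceptions being caught instead of killing the worker.

```ruby
# N threads pull jobs from one shared queue inside a single process.
# Exceptions are caught and collected rather than taking the worker down.
def run_threaded_workers(jobs, thread_count)
  queue = Queue.new
  jobs.each { |job| queue << job }
  thread_count.times { queue << :stop }   # one stop marker per thread

  results = Queue.new                     # Queue is safe to share across threads
  errors  = Queue.new

  workers = Array.new(thread_count) do
    Thread.new do
      while (job = queue.pop) != :stop
        begin
          results << job.call
        rescue StandardError => e
          errors << e                     # stand-in for an exception handler
        end
      end
    end
  end
  workers.each(&:join)
  { done: results.size, failed: errors.size }
end

jobs = Array.new(9) { |i| -> { i * 2 } } + [-> { raise 'boom' }]
puts run_threaded_workers(jobs, 4).inspect
```

There is no fork and no second copy of the process anywhere here, which is the overhead the threaded model gets back.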

Now that we know how Sidekiq and Resque work we can get on to testing them and comparing the results.

Sidekiq @ Github

The Test Code

The idea behind the test was to pick a CPU bound processing task, in this case SHA256 and apply it across a set of 20 numbers, 150,000 times.

require 'sidekiq'
require 'resque'
require 'digest'


# Running: 
# sidekiq -r ./por.rb -c 240 
#
# require 'sidekiq' 
# require './por'
# queueing: 150_000.times { Sidekiq::Client.enqueue(POR, [rand(123098)]*20) }
# queueing: 150_000.times { Resque.enqueue(POR, [rand(123098)]*20) }

class POR
  include Sidekiq::Worker

  @queue = :por

  def perform(arr)
    arr.each do |a|
      Digest::SHA2.new << a.to_s
    end
  end

  def self.perform(arr)
    arr.each do |a|
      Digest::SHA2.new << a.to_s
    end
  end

end

Test Machine

  Model Name: Mac Pro
  Model Identifier: MacPro4,1
  Processor Name: Quad-Core Intel Xeon
  Processor Speed: 2.26 GHz
  Number of Processors: 2
  Total Number of Cores: 8
  L2 Cache (per Core): 256 KB
  L3 Cache (per Processor): 8 MB
  Memory: 12 GB
  Processor Interconnect Speed: 5.86 GT/s

With Hyper-Threading, the 8 physical cores give us a total of 16 logical cores to use for our testing. I’m also using a Crucial M4 SSD.

Results

Time to Process 150,000 sets of 20 numbers

Type                           Time to Completion (seconds)
Sidekiq (JRuby) 150 Threads    88
Sidekiq (JRuby) 240 Threads    89
Sidekiq (JRuby) 50 Threads     91
Sidekiq (MRI) 5x50             98
Sidekiq (MRI) 3x50             120
Sidekiq (MRI) 50               312
Resque 50                      396
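
To put those times in perspective, here is a quick calculation of how much faster each setup is relative to Resque. The numbers are copied from the results above; the `speedup_over` helper is just something I wrote for this comparison.

```ruby
# Completion times in seconds, copied from the results above.
RESULTS = {
  'Sidekiq (JRuby) 150 threads' => 88,
  'Sidekiq (JRuby) 240 threads' => 89,
  'Sidekiq (JRuby) 50 threads'  => 91,
  'Sidekiq (MRI) 5x50'          => 98,
  'Sidekiq (MRI) 3x50'          => 120,
  'Sidekiq (MRI) 50'            => 312,
  'Resque 50'                   => 396
}.freeze

# Speedup factor of every entry relative to the chosen baseline
def speedup_over(baseline, results)
  base = results.fetch(baseline)
  results.map { |name, seconds| [name, (base.to_f / seconds).round(2)] }.to_h
end

speedup_over('Resque 50', RESULTS).each do |name, factor|
  puts format('%-30s %.2fx', name, factor)
end
```

The best JRuby run comes out at 4.5 times faster than Resque, while a single MRI Sidekiq process is only about 1.27 times faster.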

All about the CPU

Resque: 50 workers

Here we can see that the forking is taking its toll on the available CPU we have for processing. Roughly 50% of the CPU is being wasted on forking and scheduling those new processes. Resque took 396 seconds to finish and process 150,000 jobs.

Sidekiq (MRI) 1 process, 50 threads

We’re not fully utilizing the CPU. When running this test it pegged one CPU at 100% usage and kept it there for the duration of the test. We have a slight overhead with system CPU usage. Sidekiq took 312 seconds with 50 threads using MRI Ruby. Let’s now take a look at doing things a bit resque-ish, and use multiple sidekiq processes to get more threads scheduled across multiple CPUs.

Sidekiq (MRI) 3 processes, 50 threads

We’re doing better. We’ve cut our processing time to a little over a third (312 seconds down to 120) and we’re utilizing more of our resources (CPUs). 3 Sidekiq processes with 50 threads each (for a total of 150 threads) took 120 seconds to complete 150,000 jobs.

Sidekiq (MRI) 5 processes, 50 threads

As we keep adding more processes that get scheduled to different cores we’re seeing the CPU usage go up even further, however with more processes comes more overhead for process scheduling (versus thread scheduling). We’re still wasting CPU cycles, but we’re completing 150,000 jobs in 98 seconds.

Sidekiq (JRuby) 50 threads

We’re doing much better now with native threads. With 50 OS level threads, we’re completing our set of jobs in 91 seconds.

Sidekiq (JRuby) 150 threads & 240 Threads

We’re no longer seeing much of an increase in CPU usage and only a slight decrease in processing time. As we keep adding more and more threads we end up running into some thread contention issues with accessing redis and how quickly we can pop things off the queue.

Overview

Even if we stick with the stock MRI ruby and go with Sidekiq, we’re going to see a huge decrease in CPU usage while also gaining a little bit of performance as well.

Sidekiq, overall, provides a cleaner, more object oriented interface (in my opinion) to inspecting jobs and what is going on in the processing queue.

In Resque you would do something like: Resque.size("queue_name"). However, in Sidekiq you would take your class, in this case, POR and call POR.jobs to get the list of jobs for that worker queue. (note: you need to require 'sidekiq/testing' to get access to the jobs method).

The only thing I find missing from Sidekiq that I enjoyed in Resque was the ability to inspect failed jobs in the web UI. However Sidekiq more than makes up for that with the ability to automatically retry failed jobs (although be careful you don’t introduce race conditions and accidentally DOS yourself).

And of course, JRuby comes out on top and gives us the best performance and bang for the buck (although your mileage may vary, depending on the task).

Further Reading

Deploying with JRuby: Deliver Scalable Web Apps using the JVM (Pragmatic Programmers)

JRuby Cookbook

Sidekiq & Resque

Sidekiq

Resque