That's Geeky – Engage your cerebral cortex
http://www.thatsgeeky.com, Mon, 17 Dec 2012 08:24:16 +0000

MCP9701A Thermistor Requires a Stable Input Voltage
http://www.thatsgeeky.com/2012/09/mcp9701a-thermistor-requires-a-stable-input-voltage/
Sun, 02 Sep 2012 03:57:13 +0000, by cyberx86

I’ve always had a bit of a love/hate relationship with computers, so it is not uncommon for me to go a few months feeling a bit apathetic towards them. Recently, I have been getting more into electronics and microcontrollers (it seems like the perfect next step after computers). So, here is what is, hopefully, the first of many posts on electronics.

I was working on a digital thermometer the other day, based on the MCP9701A (and an ATmega88 AVR and LCD display). This device (described as a low power, linear, active thermistor) provides an output voltage that varies linearly with temperature (in this case, Vout = 400mV + 19.53mV/°C × T, where T is the temperature in °C). Testing the device with a multimeter on an isolated circuit gave the expected results (around 1V output on a 30°C day). Strangely though, when connected to the microcontroller and LCD display, the output (both as read by the ATmega’s ADC and by a multimeter) was in the range of 2.5V to 4V (which corresponds to roughly 108°C to 184°C) – obviously inaccurate (at those temperatures, I wouldn’t be around to write this).
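As a rough illustration of the transfer function above, the conversion from output voltage to temperature can be sketched in Python. The function and constant names are hypothetical, chosen only for this example; the 400mV offset and 19.53mV/°C slope are the figures quoted in the post.

```python
# Figures quoted in the post for the MCP9701A.
V_OFFSET_MV = 400.0       # output at 0 °C, in millivolts
SLOPE_MV_PER_C = 19.53    # millivolts per degree Celsius

def vout_to_celsius(vout_mv, offset_mv=V_OFFSET_MV, slope=SLOPE_MV_PER_C):
    """Invert Vout = offset + slope * T to recover T in °C."""
    return (vout_mv - offset_mv) / slope

print(round(vout_to_celsius(1000.0), 1))   # about 1V on the isolated circuit
print(round(vout_to_celsius(2500.0), 1))   # the suspicious 2.5V reading
```

A 1V reading works out to about 30.7°C, which matches the isolated-circuit test, while 2.5V would mean a temperature above the boiling point of water.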

Another interesting symptom I noted was that, even in the absence of the thermistor, the ADC still read values in excess of 2V. I tried a few approaches, with varying degrees of success:

  • I initially thought that perhaps the ADC pins were configured as outputs – this was completely erroneous (in retrospect, it doesn’t even seem logical – both in terms of the values being obtained, and the fact that the ADC was reading values from the pin).
  • Use a pull-down resistor (around 10kΩ, to ground) on the ADC pin – this mostly eliminated the voltage when the thermistor was not connected (readings between 0mV – 15mV), but had no effect when it was connected.
  • Use a small capacitor (2.2nF) between the Vout and ground of the thermistor – no effect on the voltage, but did stabilize the values somewhat.

The problem turned out to be something a bit different. I power my setup using a 7.5V (1A, DC) adapter and a 7805 voltage regulator. Being rather new to the world of electronics (although, I have worked with more ‘traditional’ electrical circuits for years), I hooked up my 7805 and tested it without a load – and got an output voltage between 4.99V and 5.01V, which I was quite happy with. Connected to my microcontroller and LCD though, the voltage dropped to about 4.6-4.7V. Now this seemed a bit peculiar to me, but since everything was running acceptably, it wasn’t something I put much thought into. (Additionally, to me, given a lower input voltage to the thermistor, I would expect a lower output voltage, not one that is significantly higher).

The solution simply entailed adding a filter capacitor to the voltage regulator. I tried both a 0.1µF and a 1µF capacitor, between Vin and ground, between Vout and ground, and both together. All seemed to have essentially the same result – the output voltage stabilized in the 4.99V to 5.01V range, and the MCP9701A started yielding a reasonable output. (That said, it was still off by a few degrees, but adjusting the formula to assume a higher offset voltage, about 515mV instead of 400mV, seemed to give much more accurate temperatures.) There is definitely much more to learn, and many other approaches to improving accuracy, some of which I may explore at a later date, but so far, it seems that, much like with computers, the small things (overlooking a capacitor) are the hardest to diagnose.
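The offset adjustment mentioned above amounts to a one-point calibration. A minimal sketch follows, assuming a trusted reference thermometer is available; the function names and the sample reading (1003mV at 25°C) are made up for illustration and are not measurements from the post.

```python
SLOPE_MV_PER_C = 19.53  # slope quoted in the post, mV per °C

def estimate_offset_mv(vout_mv, reference_temp_c, slope=SLOPE_MV_PER_C):
    """Solve Vout = offset + slope * T for the offset, given a trusted T."""
    return vout_mv - slope * reference_temp_c

def corrected_celsius(vout_mv, offset_mv, slope=SLOPE_MV_PER_C):
    """Convert a reading to °C using the calibrated offset."""
    return (vout_mv - offset_mv) / slope

# Hypothetical reading: 1003mV while a reference thermometer shows 25 °C.
offset = estimate_offset_mv(1003.0, 25.0)
print(round(offset, 2))
print(round(corrected_celsius(1003.0, offset), 1))
```

With these made-up numbers the estimated offset lands near 515mV, the same ballpark as the value that worked in practice.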

References:

  1. MCP9701A Data Sheet – Microchip Technology Inc. [PDF]
Adding a Notification to Old WordPress Posts
http://www.thatsgeeky.com/2012/04/adding-a-notification-to-old-wordpress-posts/
Thu, 05 Apr 2012 17:06:52 +0000, by cyberx86

While some articles may be timeless, depending on the type of content being published, many posts have a definite expiry date. It is the double-edged sword of rapidly evolving technologies – they improve swiftly, but articles written about them become outdated just as fast. This site, for instance, is primarily about cloud computing and servers – as such, many of the posts, from even a few months ago, may no longer be applicable.

When searching for technology related articles, I usually restrict the search to the past year. Articles without a date are simply annoying, and those where the date is in some obscure location aren’t much better.

A common approach to dealing with outdated posts is to add a notice to them. The idea being that there is likely some useful information in them, despite their age, but that readers should be aware that the post was from some time ago and some things may have changed in the interim.

I’d suggest there are (at least) three ways of accomplishing the above:

  1. Manually edit posts
    • When you feel that a post merits such a notice, add one in. This may be a good approach if there are only one or two posts that actually need it. For any site that generates more than a few of these, a more automated approach would be better.
  2. Edit your theme.
    • This seems to be the commonly proposed solution: add a few lines of code to the theme to display the notification. It is quite easy to implement, although it may take a few moments to find the right file. The problem is that any updates to the theme will overwrite your modifications. Arguably, the solution may lie in a child theme, but really, you only want to modify a small part of an existing function, not replace that function completely. This site uses a child theme with Twenty Eleven, and I added the following to content-single.php, immediately preceding <div class="entry-content">:
      <?php
      if((get_the_time('U') < strtotime('3 months ago'))) { 
      	$now = new DateTime();
      	$ref = new DateTime(get_the_time('r'));
      	$diff = $now->diff($ref);
      	$interval = "";
      	if ($diff->y){$interval .= $diff->y . " year" . ($diff->y == 1?"":"s");}
      	if ($diff->m){$interval .= ($interval?", ":"") . $diff->m . " month" . ($diff->m == 1?"":"s");} ?>
       
      	<div class="old-post">
      		This article was published <strong><?php echo $interval; ?> ago</strong>. Due to the rapidly evolving world of technology, some concepts may no longer be applicable.
      	</div>
      <?php } ?>
  3. Use a plugin. This would be the way to go – it will withstand updates and offer a degree of customization, without having to hardcode values into your theme. The problem is that it is rather difficult to come across such a plugin. At least one does exist, however: Old Posts Notifier. Unfortunately, it was written for the 2.x versions of WordPress, hasn’t been updated in over 2 years, and has a bit of undesired tracking included in it.
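The "N years, M months" string built by the theme snippet in option 2 can be sketched independently of WordPress. The Python below only approximates what PHP's DateTime::diff() reports in its y and m fields, and the function name is hypothetical.

```python
from datetime import date

def age_label(published, today):
    """Return a label such as '1 year, 2 months' for the age of a post."""
    months = (today.year - published.year) * 12 + (today.month - published.month)
    if today.day < published.day:
        months -= 1                      # the current month is not complete yet
    years, months = divmod(max(months, 0), 12)
    parts = []
    if years:
        parts.append(f"{years} year{'' if years == 1 else 's'}")
    if months:
        parts.append(f"{months} month{'' if months == 1 else 's'}")
    return ", ".join(parts)

print(age_label(date(2012, 4, 5), date(2013, 6, 10)))
```

As in the PHP version, a post younger than one month produces an empty string, which is why the notice is only shown once the age threshold has been passed.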

I had a bit of spare time earlier in the week, so I extracted my in-line modifications into a plugin – ‘Aged Posts’, and added a few things to make it more useable. Essentially, I wanted to be able to:

  • Change the displayed text and style from a settings page
  • Change the post age limit for displaying the notification.
  • Exclude individual posts from displaying the notification.

The result is fairly simple – it doesn’t have a lot of options, but it is functional. The plugin saves the style to a css file – which allows it to be included in any minification performed.

  • Note: This plugin requires PHP 5.3 or later, since it uses DateTime::diff().
  • Tested with WordPress 3.2 and 3.3, with the TwentyTen and TwentyEleven themes.
  • This plugin is NOT internationalized – I might get around to it if people find it useful. Likewise, I haven’t yet posted it on WordPress’ plugins site, but may do so in the future.
    • Update (Oct 9, 2012): Added some form of i18n to the plugin at the prompting of ‘giacomo’ (commenter, below) – might be a bit rough (and is somewhat untested), but it appears to work.

    Undoubtedly there are some non-standard code practices used (it is my first WordPress plugin), but hopefully nothing of major concern.

    If you find a bug, have an improvement or suggestion, or simply found it useful, please let me know.

    Translations

    If you are using the i18n version, translations are supported through the use of PO/MO files. You can add PO/MO files to the plugin’s languages folder, or under WP_LANG_DIR/agedposts/ (which will be used in preference to the plugin’s language folder). Using a directory external to the plugin’s own directory allows for upgrades to not overwrite custom translations.

    Italian (courtesy of Giacomo)

    The Code

    Download (no i18n): agedposts.zip [39.2kB, MD5: 683B0D1F79C7ABA2749F3233EA6F8D7B]
    Download (with i18n): agedposts-i18n [43.8kB, MD5: 7780659593F181626CCDCFDCC9558053]

    <?php
    /*
    Plugin Name: Aged Posts
    Version: 0.0.1
    Plugin URI: http://www.thatsgeeky.com/2012/04/adding-a-notification-to-old-wordpress-posts
    Description: Add a notice to old posts indicating that content may be outdated.
    Author: cyberx86
    Author URI: http://www.thatsgeeky.com
    License: GPL2
    */
     
    if (!class_exists('AgedPosts')) {
    	function call_AgedPosts()
    	{
    		return new AgedPosts();
    	}
    	add_action('plugins_loaded', 'call_AgedPosts');
    	register_activation_hook(__FILE__, array('AgedPosts', 'install'));
     
    	class AgedPosts
    	{
    		var $disp_interval = "";
    		var $interval = "";
    		var $prefix = "agedposts_";
     
    		function __construct()
    		{
    			if (is_admin()) {
    				add_action('save_post', array(&$this,'meta_save'));
    				add_action('admin_menu', array(&$this,'admin_menu'));
    			} else {
    				add_action('wp', array(&$this,'main'));
    			}
    		}
    		// Static: register_activation_hook() calls this without an instance.
    		public static function install()
    		{
    			$prefix = "agedposts_";
    			$default_options = array(
    				$prefix . 'age' => 3,
    				$prefix . 'age_units' => 'months',
    				$prefix . 'priority' => 20,
    				$prefix . 'display_text' => 'This post was published <strong>@@INTERVAL@@ ago</strong>. Some material it contains may no longer be applicable.'
    			);
     
    			foreach ($default_options as $option_name => $option_value) {
    				if (get_option($option_name) === FALSE) {
    					add_option($option_name, $option_value, '', 'no');
    				}
    			}
     
    			$agedposts_style = get_option($prefix . 'style');
    			if ($agedposts_style === FALSE) {
    				$agedposts_style = file_get_contents(plugin_dir_path(__FILE__) . "style.css.orig");
    				add_option($prefix . 'style', $agedposts_style, '', 'no');
    			}
    			file_put_contents(plugin_dir_path(__FILE__) . "style.css", $agedposts_style);
    		}
    		function main()
    		{
    			if (is_single()) {
    				if (get_post_meta(get_the_ID(), $this->prefix . 'include', true) !== '0') {
    					$priority       = get_option($this->prefix . 'priority');
    					$timestamp      = get_the_time('U');
    					$age            = intval(get_option($this->prefix . 'age'));
    					$age_units      = get_option($this->prefix . 'age_units');
    					$this->interval = "$age $age_units ago";
    					if ($timestamp < strtotime($this->interval)) {
    						add_filter('the_content', array(&$this,	'display'), $priority);
    						add_action('wp_enqueue_scripts', array(&$this,'add_stylesheet'));
    						$now  = new DateTime();
    						//Todo: Add PHP 5.2 compatibility
    						$diff = $now->diff(new DateTime('@' . $timestamp));
    						if ($diff->y) {
    							$this->disp_interval .= $diff->y . " year" . ($diff->y == 1 ? "" : "s");
    						}
    						if ($diff->m) {
    							$this->disp_interval .= ($this->disp_interval ? ", " : "") . $diff->m . " month" . ($diff->m == 1 ? "" : "s");
    						}
    					}
    				}
    			}
    		}
    		function display($content)
    		{
    			$display_text = get_option($this->prefix . 'display_text');
    			$content      = '<div class="'.$this->prefix . 'old-post">' . str_replace("@@INTERVAL@@", $this->disp_interval, $display_text) . '</div>' . $content;
    			return $content;
    		}
    		function add_stylesheet()
    		{
    			wp_register_style($this->prefix . 'style', plugins_url('style.css', __FILE__));
    			wp_enqueue_style($this->prefix . 'style');
    		}
    		function add_ap_meta_box()
    		{
    			add_meta_box($this->prefix . 'meta', __('Aged Posts'), array(&$this,'render_meta_box_content'), 'post', 'side');
    		}
    		function render_meta_box_content($post)
    		{
    			wp_nonce_field(plugin_basename(__FILE__), $this->prefix . 'nonce');
    			echo '<label for="'.$this->prefix .'include"><input type="checkbox" id="'.$this->prefix .'include" name="'.$this->prefix .'include" value="1"' . (get_post_meta($post->ID, $this->prefix .'include', true) === '0' ? '' : 'checked') . '/>&nbsp;Display "old post" notice</label>';
    		}
    		function meta_save($post_id)
    		{
    			if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE) {
    				return;
    			}
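    			// Bail out if the meta box was not part of this request.
    			if (!isset($_POST[$this->prefix . 'nonce']) || !wp_verify_nonce($_POST[$this->prefix . 'nonce'], plugin_basename(__FILE__))) {
    				return;
    			}
    			if (isset($_POST['post_type']) && 'page' == $_POST['post_type']) {
    				if (!current_user_can('edit_page', $post_id)) {
    					return;
    				}
    			} else {
    				if (!current_user_can('edit_post', $post_id)) {
    					return;
    				}
    			}
    			// An unchecked checkbox is absent from $_POST, so test with empty().
    			$agedposts_include = (empty($_POST[$this->prefix . 'include']) ? 0 : 1);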
    			if (!add_post_meta($post_id, $this->prefix . 'include', $agedposts_include, true)) {
    				update_post_meta($post_id, $this->prefix . 'include', $agedposts_include);
    			}
    		}
    		function admin_menu()
    		{
    			add_options_page('Aged Posts Options', 'Aged Posts', 'manage_options', $this->prefix , array(&$this,'admin_options'));
    			add_action('add_meta_boxes', array(&$this,'add_ap_meta_box'));
    		}
    		function admin_options()
    		{
    			if (!current_user_can('manage_options')) {
    				wp_die(__('You do not have sufficient permissions to access this page.'));
    			}
     
    			if (isset($_POST[$this->prefix . 'submit'])) {
    				// WordPress adds slashes to $_POST, so strip them before saving.
    				$agedposts_style = stripslashes($_POST[$this->prefix . 'style']);
    				update_option($this->prefix . 'age', intval($_POST[$this->prefix . 'age']));
    				update_option($this->prefix . 'age_units', stripslashes($_POST[$this->prefix . 'age_units']));
    				update_option($this->prefix . 'priority', intval($_POST[$this->prefix . 'priority']));
    				update_option($this->prefix . 'display_text', stripslashes($_POST[$this->prefix . 'display_text']));
    				update_option($this->prefix . 'style', $agedposts_style);
    				if(file_put_contents(plugin_dir_path(__FILE__) . "style.css", $agedposts_style)===FALSE){
    					?><div class="error"><p><strong>Unable to write stylesheet.</strong> Please manually update <code><?php echo plugin_dir_path(__FILE__) . "style.css"; ?></code></p></div><?php
    				}else{
    					?><div class="updated"><p><strong>Settings saved.</strong></p></div><?php
    				}
    			}
    			?><div class="wrap">
    			  <form name="<?php echo $this->prefix; ?>options" method="post">
    				<h2>Aged Posts Options</h2>
    				<h3>Post Age</h3>
    				<p>Display notice on posts older than:
    				  <input type="text" id="<?php echo $this->prefix; ?>age" name="<?php echo $this->prefix; ?>age" value="<?php
    						echo get_option($this->prefix . 'age');
    						$agedposts_age_units = get_option($this->prefix . 'age_units'); ?>"/>
    				  <select name="<?php echo $this->prefix; ?>age_units" id="<?php echo $this->prefix; ?>age_units">
    					<option value="weeks"<?php echo ($agedposts_age_units == 'weeks' ? ' selected="selected"' : ''); ?>>Weeks</option>
    					<option value="months"<?php	echo ($agedposts_age_units == 'months' ? ' selected="selected"' : '') ;?>>Months</option>
    					<option value="years"<?php echo ($agedposts_age_units == 'years' ? ' selected="selected"' : ''); ?>>Years</option>
    				  </select>
    				</p>
    				<h3>Priority</h3>
    					<p>The priority is used by WordPress to determine the order in which filter functions are run. Lower values result in earlier execution. The default for most filters is 10; the default for this plugin is 20. Only change this value if you find other plugins are conflicting with this one.</p>
    					<p>Priority:  <input type="text" id="<?php echo $this->prefix; ?>priority" name="<?php echo $this->prefix; ?>priority" value="<?php echo get_option($this->prefix . 'priority');?>"/> </p>
    				<h3>Text to Display</h3>
    				<p>Text to display (@@INTERVAL@@ will be replaced with the number of months/years):<br/>
    				  <textarea id="<?php echo $this->prefix; ?>display_text" name="<?php echo $this->prefix; ?>display_text" cols="100" rows="2"><?php 
    					echo esc_textarea(get_option($this->prefix . 'display_text')); 
    				  ?></textarea>
    				</p>
    				<h3>Appearance</h3>
    				<p>The following CSS will be used to style the notice. It has been loaded from the database, and will be saved to a file:<br />
    				  <textarea id="<?php echo $this->prefix; ?>style" name="<?php echo $this->prefix; ?>style" cols="100" rows="15"><?php
    					echo esc_textarea(get_option($this->prefix . 'style')); 
    				  ?></textarea>
    				</p>
    				<hr />
    				<p class="submit">
    				  <input type="submit" name="<?php echo $this->prefix; ?>submit" class="button-primary" value="<?php esc_attr_e('Save Changes'); ?>" />
    				</p>
    			  </form>
    			</div><?php
    		}
    	}
    }
    ?>


    WebSockets – Varnish, Nginx, and Node.js
    http://www.thatsgeeky.com/2012/03/websockets-varnish-nginx-and-node-js/
    Mon, 26 Mar 2012 17:28:44 +0000, by cyberx86

    Like many others, I have been drawn in by the appeal of websockets and their use in (near) real-time communication. As such, one of my current projects uses Node.js and websockets (via socket.io). To maximize compatibility, I would, of course, like my Node.js site to run on port 80. My server, however, is not used exclusively for this project – it also has traditional PHP/MySQL sites running on it. Which brings me to my problem:

    My current setup has Varnish as a caching layer – to cache the dynamic PHP scripts – and Nginx as a webserver. Together, these have excellent performance. My objective, in adding in Node.js is to have it running behind both of these. Varnish binds to port 80 and provides the publicly accessible interface to all sites on the server – this will allow me to cache dynamically generated content from Node.js as well as the content that is currently cached. Nginx will serve my static content – from some brief tests, it appears to far outperform Node.js in this area. Finally, any requests for dynamic content or websockets will be handled by Node.js.

    As is good practice, static content will be served from a separate subdomain, but I would like all remaining content (including the websockets) to be served from the main domain. The rest of this article outlines the configurations I have in place to attain the above.

    To recap, the objectives are:

    • Have a single public port for both websocket and ‘regular’ data
    • Be able to optionally cache some resources using Varnish
    • Serve (uncached) static assets directly from nginx (which may then be cached by Varnish)
    • Pass requests for ‘web pages’ to nginx, and from there proxy to Node.js
    • Pass websocket requests directly (from Varnish) to Node.js (bypassing nginx).

    My server stack is

    • Varnish (v3.0.2) – port 80
    • Nginx (v1.0.14) – port 81
    • Node.js (v0.6.13) – port 1337
      • Socket.io (v0.9.2)
      • Express (v2.5.8)
    • Operating system is Amazon’s Linux (v2011.09)
      • Also tested on CentOS (v6.2)

    Varnish

    Below is an edited version of my /etc/varnish/default.vcl. Some customizations and parts irrelevant to the topic at hand have been edited out.

    #define backends and timeouts
    backend default {
        .host = "127.0.0.1";
        .port = "81";
        .connect_timeout = 5s;
        .first_byte_timeout = 30s;
        .between_bytes_timeout = 60s;
        .max_connections = 800;
    }
    backend nodejs{
        .host = "127.0.0.1";
        .port = "1337";
        .connect_timeout = 1s;
        .first_byte_timeout = 2s;
        .between_bytes_timeout = 60s;
        .max_connections = 800;
    }
    
    #Removed: ACL for purging
    
    sub vcl_recv {
        set req.backend = default;
        set req.grace = 120s;
        
        #set the correct IP so my backends don’t log all requests as coming from Varnish
        if (req.restarts == 0) {
            if (req.http.x-forwarded-for) {
                set req.http.X-Forwarded-For =
                req.http.X-Forwarded-For + ", " + client.ip;
            } else {
                set req.http.X-Forwarded-For = client.ip;
            }
        }
        
        #remove port, so that hostname is normalized
        set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
        
        #Removed: code for purging
        
        #part of Varnish’s default config
        if (req.request != "GET" &&
            req.request != "HEAD" &&
            req.request != "PUT" &&
            req.request != "POST" &&
            req.request != "TRACE" &&
            req.request != "OPTIONS" &&
            req.request != "DELETE") {
            /* Non-RFC2616 or CONNECT which is weird. */
            return (pipe);
        }
        if (req.request != "GET" && req.request != "HEAD") {
            return (pass);
        }
        
        #pipe websocket connections directly to Node.js
        if (req.http.Upgrade ~ "(?i)websocket") {
            set req.backend = nodejs;
            return (pipe);
        }
        
        #do not cache large static files
        if (req.url ~ "\.(avi|flv|mp(e?)g|mp4|mp3|gz|tgz|bz2|tbz|ogg)$") {
            return(pass);
        }
        
        #general URL manipulation and cookie removal
        #lines 60-109 from https://github.com/mattiasgeniar/varnish-3.0-configuration-templates/blob/d86d6c1d7d3d0ddaf92019dd5ef5ce66c9e53700/default.vcl
        
        if (req.http.Host ~ "^(www\.)?example\.com") {
            #Removed: Redirect for URL normalization using error 701
            # Requests made to this path relate to websockets - pass does not seem to work (even for XHR polling)
            if (req.url ~ "^/socket\.io/") {
                set req.backend = nodejs;
                return (pipe);
            }
        #My other PHP/MySQL sites get included here, each in its own block
        } else if (req.http.Host ~ "^(www\.)?thatsgeeky\.com") {
            #...
        }
        
        # part of Varnish’s default config
        if (req.http.Authorization || req.http.Cookie) {
            /* Not cacheable by default */
            return (pass);
        }
        return (lookup);
    }
    
    sub vcl_pipe {
        #we need to copy the upgrade header
        if (req.http.upgrade) {
            set bereq.http.upgrade = req.http.upgrade;
        }
        #closing the connection is necessary for some applications – I haven’t had any issues with websockets keeping the line below commented out
        #set bereq.http.Connection = "close";
         return (pipe);
    }
    
    # sub vcl_pass - unmodified
    # sub vcl_hash - mostly unmodified – added hash by content-encoding
    # sub vcl_hit - mostly unmodified – added PURGE code
    # sub vcl_miss – mostly unmodified – added PURGE code
    # sub vcl_fetch - mostly unmodified – added set beresp.grace = 30m; and some site specific additions
    # sub vcl_deliver - modify some headers
    # sub vcl_error - custom error page and handle redirects for URL normalization
    # sub vcl_init - unmodified
    # sub vcl_fini – unmodified

    Since Nginx does not handle websocket requests (although there is a TCP module that may help with this), we cannot send websocket requests to Nginx – they must go directly to Node.js. As such, we must set up two backend definitions – one for Nginx and one for Node.js. The specific timeout parameters for each backend are a personal preference and are largely arbitrary.

    As with most setups, vcl_recv is the function with the most going on. In addition to the standard parts found in the Varnish config, the above looks for websocket connections and sends them directly to Node.js (the code comes directly from the Varnish documentation). It should be mentioned that I do not let Node.js serve the socket.io client. My pages call it from a different location and it is served by Nginx.
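To make the order of decisions in vcl_recv easier to follow, here is a simplified Python model of it. This is only a sketch under assumptions: it handles a single host, skips the cookie and URL normalization steps, and the function name and the "nginx"/"nodejs" labels (standing for the two backends defined above) are illustrative.

```python
import re

KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}
LARGE_STATIC = re.compile(r"\.(avi|flv|mp(e?)g|mp4|mp3|gz|tgz|bz2|tbz|ogg)$")

def route(method, url, headers):
    """Return (backend, action), checking conditions in the same order as the VCL."""
    h = {name.lower(): value for name, value in headers.items()}
    if method not in KNOWN_METHODS:
        return ("nginx", "pipe")        # unknown method: pipe to the default backend
    if method not in ("GET", "HEAD"):
        return ("nginx", "pass")        # never cache PUT, POST, and so on
    if re.search(r"websocket", h.get("upgrade", ""), re.IGNORECASE):
        return ("nodejs", "pipe")       # websocket upgrade goes straight to Node.js
    if LARGE_STATIC.search(url):
        return ("nginx", "pass")        # do not cache large static files
    if url.startswith("/socket.io/"):
        return ("nodejs", "pipe")       # socket.io transports, including polling
    if "authorization" in h or "cookie" in h:
        return ("nginx", "pass")        # not cacheable by default
    return ("nginx", "lookup")          # everything else may be served from cache

print(route("GET", "/chat", {"Upgrade": "WebSocket"}))
print(route("GET", "/", {}))
```

Only the two pipe branches for websockets and socket.io ever select the Node.js backend; every other request is handed to Nginx, either through the cache or around it.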

    Nginx

    The config below is simply the section that I include for the one site – common options (from nginx.conf) are not included.

    upstream node_js {
        server 127.0.0.1:1337;
        server 127.0.0.1:1337;
    }
    server {
        listen *:81;
        server_name example.com www.example.com static.example.com;
        root /var/www/example.com/web;
        error_log /var/log/nginx/example.com/error.log info;
        access_log /var/log/nginx/example.com/access.log timed;
        
        #removed error page setup
        
        #home page
        location = / {
            proxy_pass http://node_js;
        }
        
        #everything else
        location / {
            try_files $uri $uri/ @proxy;
        }
        location @proxy{
            proxy_pass http://node_js;
        }
        
        #removed some standard settings I use
    }

    Firstly, with the upstream block, we define our backend servers. You’ll note that I have the same server listed twice. This is because of the way Nginx falls back in the event of a backend failure: with a second entry listed, it will give the request another try.

    The objective with the above, is to serve all files that exist using nginx, and to proxy all other requests to Node.js. The use of the location = / block is due to the root directory existing, but wanting it handled by Node.js and not Nginx.
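The combined effect of the location = / block and try_files can be modelled in a few lines. The sketch below is an approximation for illustration only: the function name is hypothetical, and it ignores details such as index files and path normalization that Nginx handles itself.

```python
import os

def handler_for(uri, root):
    """Mimic `location = /` plus `try_files $uri $uri/ @proxy`."""
    if uri == "/":
        return "node_js"            # exact-match location for the home page
    path = os.path.join(root, uri.lstrip("/"))
    if os.path.isfile(path):        # try_files $uri
        return "nginx"
    if os.path.isdir(path):         # try_files $uri/
        return "nginx"
    return "node_js"                # nothing on disk: fall through to @proxy

print(handler_for("/", "/var/www/example.com/web"))
```

Any URI that does not correspond to a file or directory under the web root ends up at Node.js, so the application only sees requests it actually needs to answer.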

    In order to track timings through each layer, I use a modified log command (timed, above, instead of main). Also, the IP addresses are updated so that Node.js doesn’t see all requests as originating from Nginx.

    set_real_ip_from 127.0.0.1;
    real_ip_header X-Forwarded-For;
    
    log_format timed '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" $request_time $upstream_response_time $pipe';
    
    port_in_redirect off;

    One of the nice things about this setup is that all data passing through Nginx can be compressed, without needing to add modules to Node.js, which would undoubtedly be slower.

    Tracking a Request

    Just for interest’s sake, it is possible to track a request through the entire server stack – I occasionally do this to track down the cause of any delays. The following request was for a login page of a Node.js site I am working on.

    Varnish:
    Logging is done with varnishncsa, using the following:

    varnishncsa -F "%h %l %u %t \"%m %U %H\" %s %b \"%{Referer}i\" %{X-Varnish}o %{Varnish:time_firstbyte}x"

    The logged request is as follows:

    xxx.xxx.xxx.xxx - - [26/Mar/2012:12:11:14 -0400] "GET /login HTTP/1.1" 200 601 "-" 1866086403 0.006932020

    Matching the XID (1866086403) to output from varnishlog, gives the full request timings:

    11 ReqEnd c 1866086403 1332778274.036613464 1332778274.043595552 0.000080347 0.006932020 0.000050068

    Looking at this a bit more closely, we see that Varnish took:

    • 0.000080347s from the time the request was accepted until processing started
    • 0.006932020s from the start of processing to the start of delivering (essentially backend time)
    • 0.000050068s from the start of delivery to the end of the request
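The ReqEnd record shown above can be split into named fields with a short script. This is a sketch that assumes the Varnish 3 field order described in the list (XID, start and end timestamps, then the three intervals); the function and key names are illustrative.

```python
from decimal import Decimal

def parse_reqend(line):
    """Split a Varnish 3 ReqEnd record into named fields (times in seconds)."""
    fields = line.split()
    # The first three columns (identifier, tag, side) precede the XID.
    xid, start, end, accept_wait, processing, delivery = fields[3:9]
    return {
        "xid": xid,
        "start": Decimal(start),
        "end": Decimal(end),
        "accept_to_start": Decimal(accept_wait),
        "processing": Decimal(processing),
        "delivery": Decimal(delivery),
    }

record = parse_reqend(
    "11 ReqEnd c 1866086403 1332778274.036613464 1332778274.043595552 "
    "0.000080347 0.006932020 0.000050068"
)
print(record["end"] - record["start"])             # time between the two timestamps
print(record["processing"] + record["delivery"])   # the same figure from the intervals
```

For this request the two printed values agree, so the gap between the timestamps is accounted for by the processing and delivery intervals. Decimal is used instead of float so the nanosecond digits survive the subtraction.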

    Nginx:
    The log is generated using the log format mentioned earlier, which I reference with the name ‘timed’:

    log_format timed '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '$request_time $upstream_response_time';

    The logged request shows up as follows:

    xxx.xxx.xxx.xxx - - [26/Mar/2012:12:11:14 -0400] "GET /login HTTP/1.1" 200 613 "-" 0.006 0.006 .

    We can see that Nginx took:

    • 0.006s ($request_time) in total to process the request
    • 0.006s ($upstream_response_time) of the total time was taken to obtain the response from the upstream server.

    Node.js:
    As part of my application, I have the following line (which sends the output to the file defined in access_logfile):

    app.use(express.logger({format: ':req[X-Forwarded-For] - - [:date] ":method :url HTTP/:http-version" :status :res[content-length] - :response-time ms', stream: access_logfile }));

    The logged response is:

    xxx.xxx.xxx.xxx - - [Mon, 26 Mar 2012 16:11:14 GMT] "GET /login HTTP/1.0" 200 1155 - 4 ms

    From the above, it appears that Node.js took:

    • 4ms to process and return the request

    Obviously the time format used by express.logger is a bit different (and I am recording less information), but the relevant data is present.
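Putting the three logged timings side by side gives a rough idea of what each layer adds. The sketch below uses the figures from the log lines above, converted to microseconds; since Nginx and express.logger only report millisecond resolution, the results are approximate. The function and key names are illustrative.

```python
# Timings from the three log lines above, in microseconds.
varnish_backend_us = 6932   # Varnish: start of processing to start of delivery
nginx_request_us = 6000     # Nginx $request_time (millisecond resolution)
node_response_us = 4000     # express.logger :response-time (millisecond resolution)

def layer_overheads(varnish_us, nginx_us, node_us):
    """Return the time each outer layer adds on top of the layer behind it."""
    return {
        "varnish_over_nginx_us": varnish_us - nginx_us,
        "nginx_over_node_us": nginx_us - node_us,
    }

print(layer_overheads(varnish_backend_us, nginx_request_us, node_response_us))
```

For this request, Nginx accounts for about 2ms beyond the Node.js response time, and Varnish adds a little under 1ms on top of that.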

    An interesting observation here is the change in protocol. The request came in as HTTP/1.1, was sent to Nginx as the same, but Nginx sent the request as HTTP/1.0. This is well documented, so not exactly a surprise – but interesting to see regardless.

    Another point to note is the changing response size. The original response from Node.js was 1155 bytes. After going through Nginx it came out at 613 bytes (since it was gzipped). Finally, my VCL modifies some of the headers (not shown above) for a final size of 601 bytes.

    DRBD on Amazon’s Linux
    http://www.thatsgeeky.com/2012/03/drbd-on-amazons-linux/
    Sat, 24 Mar 2012 15:56:50 +0000, cyberx86

    Note: this was done more as an experiment than for something I intended to use in production – so consider it to be more a compilation of notes than a full out procedure.

    DRBD – Distributed Replicated Block Device – is a kernel level storage system that replicates data across a network. It uses TCP – and typically runs on port(s) starting at 7788. A typical setup pairs DRBD with Heartbeat/Corosync so that, if a node fails, the other node can be promoted to primary. Alternatively, a dual-primary setup can be combined with a cluster filesystem so that both nodes can access the data simultaneously.

    The setup described below will only allow one node to access the data at any given time and requires a manual failover to promote the secondary node to primary.

    For the following, I am using 2 up-to-date instances running Amazon’s Linux AMI 2011.09 (ami-31814f58) – which is derived from CentOS/RHEL. Both are in the same security group, and these are the only two instances in that security group. Also the hostnames of both instances are unchanged from their default – this is only relevant if you try to use the script included below – if you manually setup the configuration, the hostnames can be whatever you wish.

    I have attached one EBS volume to each instance (in addition to the root volume), at /dev/sdf (which is actually /dev/xvdf on Linux).

    Install DRBD

    Note: all steps in the section are to be performed on both nodes

    This AMI already includes the DRBD kernel module in its default kernel. You can verify this with the following:

    modprobe -l | grep drbd
    kernel/drivers/block/drbd/drbd.ko

    Likewise, to find the version of the kernel module, you can use:

    modinfo drbd | grep version
    version:        8.3.8

    It is typically preferable to have the version of the kernel module match the version of the userland binaries. DRBD is no longer included in the CentOS 6 repository – and is not in either the amzn or EPEL repositories. The remaining options are therefore to use another repository or to build from source – I’d favour the former.

    ELRepo – which contains primarily hardware-related packages – maintains up to date binaries for CentOS and its derivatives – we can either install a specific RPM or simply use the latest copy from the repository.

    rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org

    From RPM (for 32-bit version):

    rpm -Uvh http://elrepo.org/linux/elrepo/el5/i386/RPMS/drbd83-utils-8.3.8-3.el5.elrepo.i386.rpm

    From Repository (current version 8.3.12 – doesn’t match the installed kernel module version 8.3.8):

    rpm -Uvh http://elrepo.org/elrepo-release-6-4.el6.elrepo.noarch.rpm
    yum install drbd83-utils

    Load the kernel module with:

    modprobe -v drbd

    Setup meta-data storage

    Note: all steps in the section are to be performed on both nodes

    DRBD can store meta-data internally or externally. Internal storage tends to be easier to recover, while external storage tends to offer better latency. Moreover, for EBS volumes using an XFS filesystem with existing data, external meta-data is required (since there is typically no place to store the meta-data on the disk – XFS can’t shrink, and EBS can’t be enlarged directly).

    According to the DRBD User Guide, meta-data size, in sectors, can be calculated with:

    echo $(((`blockdev --getsz /dev/xvdf`/32768)+72))

    However, for external meta-data disks, it appears that you need 128MiB per index (disk). Creating a smaller disk will result in the error “Meta device too small”.
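
    As a rough sanity check of the formula, the arithmetic can be wrapped in a small function. The name drbd_meta_sectors and the 1 GiB example size are assumptions for illustration; on a real system the sector count would come from blockdev --getsz.

```shell
# Meta-data size, in 512-byte sectors, for a device of $1 sectors.
# Same arithmetic as the one-liner above (hypothetical helper name).
drbd_meta_sectors() {
    echo $(( $1 / 32768 + 72 ))
}

# Example: a 1 GiB volume is 2097152 sectors of 512 bytes
sectors=$(drbd_meta_sectors 2097152)
echo "$sectors sectors = $(( sectors * 512 / 1024 )) KiB"
```

    For a 1 GiB volume this works out to 136 sectors (68 KiB), far below the 128MiB needed for the external meta-data device.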

    To create our meta-data storage (/var/drbd-meta – change as desired) – initially zeroed out – we will use dd, with /dev/zero as an input source and then mount the file on a loopback device.

    dd if=/dev/zero of=/var/drbd-meta bs=1M count=128
    losetup /dev/loop0 /var/drbd-meta

    Configure DRBD

    The default DRBD install creates /etc/drbd.conf – which includes /etc/drbd.d/global_common.conf and /etc/drbd.d/*.res. You will want to make some changes to global_common.conf – for performance and error handling, but for now I am just using the default.

    You will need to know the hostname and IP address of both instances in your cluster to set up a resource file. It is important to note that DRBD uses the IP address of the local machine to determine which interface to bind to – therefore, you must use the private IP address for the local machine.

    You can, of course, use an elastic IP as the public IP address. The default port used by DRBD is 7788, and I have used the same, below – you need to open this port (TCP) in your security group.

    Set up a resource file /etc/drbd.d/drbd_res0.res.tmpl (on both nodes):

    resource drbd_res0 {
    syncer {rate 50M;}
    device     /dev/drbd0;
    disk       /dev/xvdf;
    meta-disk  /dev/loop0[0];
    on @@LOCAL_HOSTNAME@@ {
        address    @@LOCAL_IP@@:7788;
    }
    on @@REMOTE_HOSTNAME@@ {
        address    @@REMOTE_IP@@:7788;
    }
    }

    The above ‘resource’ defines the basic information about the disk and the instances. Note: you should change the ‘disk’ to match the device name you attached your EBS volume as, and ‘meta-disk’ should correspond to the device setup above (or use internal).

    If you manually replace the template placeholders, above, you must use the private IP address for the LOCAL_IP, however, you can use either the public or private IP for the REMOTE_IP. The LOCAL_HOSTNAME and REMOTE_HOSTNAME values should match the output of the hostname command on each system. Keep in mind that if you are using a public IP address, you may incur data transfer charges (also keep in mind that an elastic IP maps to the private IP address at times, which will save on data transfer charges). Also the file extension should be .res (not .tmpl) if you make the replacement manually.

    A typical setup would have identical resource files on both the local and remote machines. If we wish to use the public IP addresses, this is not possible (since the public IP is not associated with an interface in EC2). Therefore, I used the following script to setup the correct values in the above file (note, you need to setup your private key and certificate in order to use the API tools):

    #!/bin/sh
    
    # Credentials for the EC2 API tools
    export EC2_PRIVATE_KEY=/path/to/pk-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.pem
    export EC2_CERT=/path/to/cert-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.pem
    
    # Find the other running instance in this instance's security group and
    # print fields 5 and 14 of its INSTANCE line (field 5 is cut at the first dot)
    REMOTE_INFO=$(ec2-describe-instances --filter instance-state-name=running --filter group-name=$(curl -s http://169.254.169.254/latest/meta-data/security-groups) | grep INSTANCE | grep -v $(curl -s http://169.254.169.254/latest/meta-data/instance-id) | awk '{sub(/\..*/, "", $5);print $5, $14}')
    
    REMOTE_HOSTNAME=$(echo $REMOTE_INFO | cut -d ' ' -f1)
    REMOTE_IP=$(echo $REMOTE_INFO | cut -d ' ' -f2)
    LOCAL_HOSTNAME=$(hostname)
    # Private IP address of eth0, which DRBD will bind to
    LOCAL_IP=$(ifconfig eth0 | grep "inet addr" | cut -d':' -f2 | cut -d' ' -f1)
    
    # Fill in the template placeholders and write the final resource file
    sed -e "s/@@LOCAL_HOSTNAME@@/$LOCAL_HOSTNAME/g" \
    -e "s/@@LOCAL_IP@@/$LOCAL_IP/g" \
    -e "s/@@REMOTE_HOSTNAME@@/$REMOTE_HOSTNAME/g" \
    -e "s/@@REMOTE_IP@@/$REMOTE_IP/g" \
    /etc/drbd.d/drbd_res0.res.tmpl > /etc/drbd.d/drbd_res0.res

    Of course, there are a few shortcomings to the above. The script uses the security group to determine the servers in the cluster, so it requires both instances to be in the same security group and will only work if that security group has exactly two instances in it (the local and one remote). It also expects the hostname to be unchanged (i.e. the value derived from ec2-describe-instances). (It would be trivial to modify it to use something other than security group – for instance a specific tag, but handling more than two instances matching the criteria would take a bit more effort).
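
    If you would rather look up the addresses some other way, the substitution step can be separated from the EC2 lookup. The sketch below is one way to do that; render_res is a hypothetical helper, and the hostnames and addresses are placeholder values rather than anything from my instances.

```shell
# Fill in the resource template read from stdin.
# Arguments: local hostname, local IP, remote hostname, remote IP.
# Assumes none of the values contain a "/" (true for hostnames and IPs).
render_res() {
    sed -e "s/@@LOCAL_HOSTNAME@@/$1/g" \
        -e "s/@@LOCAL_IP@@/$2/g" \
        -e "s/@@REMOTE_HOSTNAME@@/$3/g" \
        -e "s/@@REMOTE_IP@@/$4/g"
}

# Placeholder values, for illustration only
printf 'on @@LOCAL_HOSTNAME@@ {\n    address    @@LOCAL_IP@@:7788;\n}\n' |
    render_res ip-10-0-0-1 10.0.0.1 ip-10-0-0-2 10.0.0.2
```

    On a node, the same function would read /etc/drbd.d/drbd_res0.res.tmpl on standard input and have its output redirected to /etc/drbd.d/drbd_res0.res.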

    At this point you should have an /etc/drbd.d/drbd_res0.res file on both nodes, with the appropriate information filled in (either manually or using a script) – it is worth mentioning that the filename doesn’t actually matter (as long as it ends in .res – which is what /etc/drbd.conf is setup to look for).

    Final steps

    We are just about done at this point – everything is configured, and DRBD is setup on each instance. We now need to actually create the meta-data disk for our specific resource (run on both nodes):

    drbdadm create-md drbd_res0

    Finally, we start DRBD (on both nodes):

    service drbd start

    We can find the status of our nodes, either by using service drbd status, drbd-overview, or cat /proc/drbd:

    version: 8.3.8 (api:88/proto:86-94)
    srcversion: 299AFE04D7AFD98B3CA0AF9
     0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
        ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:1048576

    At this point, we have not actually defined which node is to be the primary node – both are therefore classed as secondary, something we will resolve momentarily.

    Up until this point, all steps have been done on both instances. Without a dual-primary setup and a cluster file system, the data on the DRBD device will only be accessible to one instance at a time. The primary node will be able to read and write to the volume, but the secondary node will not. In a failover scenario, we would promote the secondary node to primary, and it will then have full access to the volume.

    We must now promote one node to primary. It is important to note that you cannot promote a node to primary if the nodes are inconsistent (see the status above). To do so, initially, you will need to use the --overwrite-data-of-peer option. Be careful, as this option will completely overwrite the data on the other node:

    drbdadm -- --overwrite-data-of-peer primary drbd_res0

    If the nodes are UpToDate, you can use:

    drbdadm -- primary drbd_res0

    Checking the status of our nodes will now show that one is primary and, if necessary, that a sync is in progress:

    version: 8.3.8 (api:88/proto:86-94)
    srcversion: 299AFE04D7AFD98B3CA0AF9
     0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
        ns:88968 nr:0 dw:0 dr:97432 al:0 bm:5 lo:5 pe:17 ua:248 ap:0 ep:1 wo:b oos:960128
            [>...................] sync'ed:  9.0% (960128/1048576)K delay_probe: 0
            finish: 0:00:32 speed: 29,480 (29,480) K/sec

    Wait for the sync to finish before proceeding – at which point there should be 0 bytes out of sync (oos:0), and both nodes should be UpToDate:

    version: 8.3.8 (api:88/proto:86-94)
    srcversion: 299AFE04D7AFD98B3CA0AF9
     0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
        ns:1048576 nr:0 dw:0 dr:1049240 al:0 bm:64 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
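
    Rather than re-reading the status by eye, the two conditions can be checked in a script. This is only a sketch: drbd_synced is a name I made up, and it assumes the 8.3 status format shown above, read from standard input.

```shell
# Succeed (exit status 0) when the /proc/drbd text on stdin shows both
# disks UpToDate and nothing out of sync. Hypothetical helper; assumes
# the DRBD 8.3 status layout with a single resource.
drbd_synced() {
    text=$(cat)
    printf '%s\n' "$text" | grep -q 'ds:UpToDate/UpToDate' &&
        printf '%s\n' "$text" | grep -Eq 'oos:0([^0-9]|$)'
}

# On a live node, something like this would poll until the sync is done:
#   until drbd_synced < /proc/drbd; do sleep 5; done
```

    The check deliberately looks at both fields, since oos can briefly read 0 before the disk states have settled.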

    Filesystem and Mounting

    At this point, we are ready to use our DRBD device. We start by setting up a filesystem. My preference is XFS:

    yum install xfsprogs
    mkfs.xfs /dev/drbd0

    (Note: both nodes should have xfsprogs installed if you use XFS as your filesystem – but you will only format the device on the primary node).

    We now create a mountpoint and mount the device (again, only on the primary node):

    mkdir /data
    mount /dev/drbd0 /data

    Hopefully, at this point everything is setup and operational – any data we save to /data should now be replicated over the network to our secondary node.

    A Quick Test

    The most basic test involves the following – create a test file on the primary node, manually failover, and check for the file on what was originally the secondary node:

    On the primary node:

    echo "This is a test" > /data/test.txt
    umount /data
    drbdadm secondary drbd_res0

    On the secondary node:

    drbdadm primary drbd_res0
    mkdir /data
    mount /dev/drbd0 /data
    cat /data/test.txt

    To be able to simultaneously access the data on both nodes, we need to set up both nodes as primary and use a cluster file system, such as OCFS2 or GFS2 (instead of XFS), in order to minimize the risk of inconsistencies. That, however, is an experiment for a future date. (Of course, there are other alternatives to DRBD – my personal preference being GlusterFS on EC2, which, while having a bit of additional overhead, is simpler to set up and has quite a few more features).

    Varnish – Nothing but 503s
    http://www.thatsgeeky.com/2012/03/varnish-nothing-but-503s/
    Mon, 12 Mar 2012 10:00:06 +0000, cyberx86

    I use Varnish on my production server without any issues – it works quite well, and I have come to consider it an essential component in my server stack. Recently, I have been having a bit of trouble with a new project, and I currently believe a misconfigured Varnish instance is responsible. As such, I wanted to reproduce the problem in my test environment, but ran into a completely different problem attempting to do so.

    In its most basic form, I have Nginx listening on port 81 (all interfaces), and Varnish listening on port 80 (all interfaces). Essentially, Varnish receives a request, checks its cache, and either serves the response from cache if it is a hit, or fetches it from the backend if it is a miss.

    The problem was that Varnish simply did not acknowledge the existence of the backend, continually returning ‘503 Service Unavailable’. The following is an account of my diagnostics and resolution to this problem. (Please note: this is a different problem than an intermittent 503 error – I was never once able to successfully load a page through Varnish before I fixed this problem).

    Environment:

    • VirtualBox 4.1.8r75467
    • Host operating system: Windows 7 SP1 (64-bit)
    • Guest operating system: CentOS 6.2 (32-bit) (minimal install)
    • Networking: bridged adapter

    As a basic setup, I used CentOS 6.2 – since this most closely resembles the Amazon Linux that I use on my production box. The official Nginx and Varnish repositories were added and nginx and varnish were installed:

    yum localinstall http://nginx.org/packages/centos/6/noarch/RPMS/nginx-release-centos-6-0.el6.ngx.noarch.rpm
    yum localinstall --nogpgcheck http://repo.varnish-cache.org/redhat/varnish-3.0/el5/noarch/varnish-release-3.0-1.noarch.rpm
    yum install nginx varnish

    Configure Nginx

    I created a basic test site – with a single static file, and overrode the default Nginx config:

    mkdir -p /var/www/example.com/web
    echo "Testing 1, 2, 3..." > /var/www/example.com/web/test.txt
    > /etc/nginx/conf.d/default.conf

    /etc/nginx/conf.d/default.conf:

    server {
        server_name example.com www.example.com;
        listen 81 default;
        root /var/www/example.com/web;
    }

    You’ll note here that I am listening on port 81 instead of port 80 – this is because I intend for Varnish to be the front end server. After these changes, I started Nginx:

    service nginx start

    Verify that Nginx is running and listening on the correct port:

    ps -ef | grep nginx
    root      1502     1  0 07:09 ?        00:00:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
    nginx     1504  1502  0 07:09 ?        00:00:00 nginx: worker process
    root      1670  1203  0 07:15 pts/0    00:00:00 grep nginx
    netstat -anp | grep nginx
    tcp        0      0 0.0.0.0:81                  0.0.0.0:*                   LISTEN      1502/nginx
    unix  3      [ ]         STREAM     CONNECTED     15289  1502/nginx
    unix  3      [ ]         STREAM     CONNECTED     15288  1502/nginx

    Finally, check that I can retrieve the page:

    curl --header "Host: example.com" 127.0.0.1:81/test.txt
    Testing 1, 2, 3...

    So far, everything looks good – Nginx is working without issue, as expected.

    Configure Varnish

    I am going for a really minimal setup here – simply trying to illustrate the problem at hand – no optimizations at all in place. Essentially, I just change the Varnish port to 80 and use malloc instead of file backed storage:

    Edit /etc/sysconfig/varnish, and change the following:

    VARNISH_LISTEN_PORT=80
    VARNISH_STORAGE_SIZE=50M
    VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"

    Now, we need to point Varnish to the backend that is running on port 81:

    Edit /etc/varnish/default.vcl:

    backend default {
        .host = "127.0.0.1";
        .port = "81";
    }

    Finally, we start Varnish:

    service varnish start

    Confirm that it is in fact running:

    ps -ef | grep varnish
    root      1571     1  0 07:11 ?        00:00:00 /usr/sbin/varnishd -P /var/run/varnish.pid -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s malloc,50M
    varnish   1572  1571  0 07:11 ?        00:00:00 /usr/sbin/varnishd -P /var/run/varnish.pid -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s malloc,50M
    root      1668  1203  0 07:15 pts/0    00:00:00 grep varnish

    Verify that it is listening on the correct port:

    netstat -anp | grep varnish
    tcp        0      0 0.0.0.0:80                  0.0.0.0:*                   LISTEN      1572/varnishd
    tcp        0      0 127.0.0.1:6082              0.0.0.0:*                   LISTEN      1571/varnishd
    tcp        0      0 :::80                       :::*                        LISTEN      1572/varnishd
    unix  2      [ ]         DGRAM                    15619  1571/varnishd

    Trying to retrieve the same file from port 80 (Varnish) results in the following:

    curl --header "Host: example.com" 127.0.0.1:80/test.txt
    
    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html>
      <head>
        <title>503 Service Unavailable</title>
      </head>
      <body>
        <h1>Error 503 Service Unavailable</h1>
        <p>Service Unavailable</p>
        <h3>Guru Meditation:</h3>
        <p>XID: 1041692010</p>
        <hr>
        <p>Varnish cache server</p>
      </body>
    </html>

    At the same time, looking at the output of varnishlog displays:

       11 SessionOpen  c 127.0.0.1 41777 :80
       11 ReqStart     c 127.0.0.1 41777 1041692011
       11 RxRequest    c GET
       11 RxURL        c /test.txt
       11 RxProtocol   c HTTP/1.1
       11 RxHeader     c User-Agent: curl/7.19.7 (i686-pc-linux-gnu) libcurl/7.19.7 NSS/3.12.7.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
       11 RxHeader     c Accept: */*
       11 RxHeader     c Host: example.com
       11 VCL_call     c recv lookup
       11 VCL_call     c hash
       11 Hash         c /test.txt
       11 Hash         c example.com
       11 VCL_return   c hash
       11 VCL_call     c miss fetch
       11 FetchError   c no backend connection
       11 VCL_call     c error deliver
       11 VCL_call     c deliver deliver
       11 TxProtocol   c HTTP/1.1
       11 TxStatus     c 503
       11 TxResponse   c Service Unavailable
       11 TxHeader     c Server: Varnish
       11 TxHeader     c Content-Type: text/html; charset=utf-8
       11 TxHeader     c Retry-After: 5
       11 TxHeader     c Content-Length: 419
       11 TxHeader     c Accept-Ranges: bytes
       11 TxHeader     c Date: Sun, 11 Mar 2012 11:16:54 GMT
       11 TxHeader     c X-Varnish: 1041692011
       11 TxHeader     c Age: 0
       11 TxHeader     c Via: 1.1 varnish
       11 TxHeader     c Connection: close
       11 Length       c 419
       11 ReqEnd       c 1041692011 1331464614.671160936 1331464614.671626806 0.000131845 0.000425816 0.000040054
       11 SessionClose c error
       11 StatSess     c 127.0.0.1 41777 0 1 1 0 0 0 257 419

    Note the line:

    11 FetchError c no backend connection

    Obviously though, I have a valid backend – I can reach it, and it returns a valid page. As part of my diagnostics, I tried the following:

    • Disabling iptables (service iptables stop)
    • Passing the backend directly to Varnish using the -b parameter (and removing the config file).
    • Trying a different backend
    • Restarting the VM and all servers more times than I can count
    • Rebuilding the entire VM from scratch
    • Using the external IP address for the backend instead of the loopback

    None of these had any effect or revealed anything useful – and Varnish didn’t log any errors either.

    SELinux

    For some odd reason though, I thought to try setting SELinux to permissive – and that was a success.

    If you are running auditd, the actions SELinux denies will be logged to /var/log/audit/audit.log

    Running grep varnish /var/log/audit/audit.log shows the blocks that occurred:

    type=AVC msg=audit(1331500393.450:25): avc: denied { name_connect } for pid=1276 comm="varnishd" dest=81 scontext=unconfined_u:system_r:varnishd_t:s0 tcontext=system_u:object_r:reserved_port_t:s0 tclass=tcp_socket
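
    When the audit log is busy, it helps to reduce each denial to the process and the port involved. A minimal sketch, assuming every denial of interest has comm= and dest= fields like the entry above (avc_summary is a hypothetical helper, not part of any SELinux tool):

```shell
# Summarise AVC denials read from stdin as "<command> -> port <dest>".
# Lines that are not denials, or that lack comm=/dest=, print nothing.
avc_summary() {
    sed -n 's/.*avc: *denied.* comm="\([^"]*\)".* dest=\([0-9]*\).*/\1 -> port \2/p'
}

# Intended use (not run here):
#   grep varnish /var/log/audit/audit.log | avc_summary
```

    For the entry above this gives varnishd -> port 81, which makes it obvious that the backend connection itself is being blocked.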

    The current state of SELinux enforcement can be determined with:

    cat /selinux/enforce
    1

    Setting SELinux to temporarily permissive and retesting shows that it was, indeed, the source of the problem:

    echo 0 >/selinux/enforce
    curl --header "Host: example.com" 127.0.0.1:80/test.txt
    Testing 1, 2, 3...

    Of course, this is only a temporary fix – as soon as the server is restarted, SELinux will go back to enforcing mode.

    You can set SELinux to be permanently permissive, or generate a policy from the logged denial using audit2allow.

    Install audit2allow:

    yum install policycoreutils-python

    Running audit2allow -a -w gives the following:

    type=AVC msg=audit(1331500393.450:25): avc:  denied  { name_connect } for  pid=1276 comm="varnishd" dest=81 scontext=unconfined_u:system_r:varnishd_t:s0 tcontext=system_u:object_r:reserved_port_t:s0 tclass=tcp_socket
            Was caused by:
            One of the following booleans was set incorrectly.
            Description:
            Allow varnishd to connect to all ports, not just HTTP.
    
            Allow access by executing:
            # setsebool -P varnishd_connect_any 1
            Description:
            Allow system to run with NIS
    
            Allow access by executing:
            # setsebool -P allow_ypbind 1

    Running setsebool -P varnishd_connect_any 1, and trying to fetch the page through Varnish, shows that the problem has indeed been fixed, although SELinux is still in enforcing mode (I didn’t need to run the second command). The change persists through a server restart.

    Root Autologin on CentOS
    http://www.thatsgeeky.com/2012/03/root-autologin-on-centos/
    Sun, 11 Mar 2012 07:18:01 +0000, cyberx86

    Do not do this – it is a very, very bad idea!!! Doing this for any reason (other than the fun of it – in a ‘safe’ virtual environment) should carry with it an eternal ban prohibiting the use of a computer.

    Alright, if you are still reading, and understand that you should never, ever do this, let’s get on with breaking some fundamental security rules of Linux.

    Firstly, to get autologin working, we will edit /etc/init/tty.conf:

    Change the line: exec /sbin/mingetty $TTY

    to: exec /sbin/mingetty --autologin root $TTY

    That’s it – restart, and you should automatically be logged in as root.

    Now, to break whatever remaining rules of security having a root autologin didn’t break, let’s also give root an empty password. To do this, we need to edit /etc/shadow. Firstly, though – the file is read-only – even to root (which is a pretty good indicator that you shouldn’t be touching it).

    Make the file writable: chmod u+w /etc/shadow

    Fields in /etc/shadow are colon delimited. The first field is the username, the second field is the password. To get an empty password, remove everything between the first and second colons for the user root so you have something like:

    root::15410:0:99999:7:::

    Save the file and revert the permissions (chmod u-w /etc/shadow).

    If you try to log in to this server via SSH, you will find that you cannot. By default, SSH requires a non-empty password. To fix that, edit /etc/ssh/sshd_config and add/uncomment the line:

    PermitEmptyPasswords yes

    You also need PermitRootLogin yes, however it is the default on CentOS 6.

    At this point you should be able to login to your server via SSH, as root, with no password. Essentially, you only need to know the IP address of your server (and the SSH port) to get in as root.

    Recap

    For CentOS 6.2, the following steps accomplish all of the above:

    sed -i -e 's/exec \/sbin\/mingetty $TTY/exec \/sbin\/mingetty --autologin root $TTY/g' /etc/init/tty.conf
    chmod u+w /etc/shadow
    sed -i -e 's/^root:[^:]*:/root::/g' /etc/shadow
    chmod u-w /etc/shadow
    sed -i -e 's/#PermitEmptyPasswords .*/PermitEmptyPasswords yes/g' /etc/ssh/sshd_config

    Just in case you didn’t get the message at the start – if you find yourself needing to do this, something is very wrong – so just don’t do it.

    Windowless VirtualBox VMs (Windows Host)
    http://www.thatsgeeky.com/2012/03/windowless-virtualbox-vms-windows-host/
    Sun, 11 Mar 2012 05:46:37 +0000, cyberx86

    Since I haven’t gotten around to playing with VMware or Xen yet – VirtualBox is what I am using for virtualization in my test environment.

    Under Windows I like to use PuTTY to connect to my VMs – even those running on the same machine. Other than providing a consistent interface, it has a few features that make life easier (cut & paste, scrollback, and my desired colour scheme being the notable ones). Given this, I really don’t need the VirtualBox GUI to launch and sit around doing nothing – I have enough necessary windows open at any time to not want an extra unnecessary one.

    So, the starting point, of course, is to get a list of your VMs:

    c:\Program Files\Oracle\VirtualBox> vboxmanage list vms
    
    "CentOS" {1aa2fa18-fc25-4610-91ca-6e984b33edd2}

    I’ve shown one VM above as an example – the output includes both a name and a UUID. If you look at the other programs in that directory, you will find ‘VBoxHeadless’ – as the name suggests, it will run a VM without a GUI – sounds like what we want. We can pass the VM to it using the name, for example:

    "c:\Program Files\Oracle\VirtualBox\VBoxHeadless" -s CentOS

    Or using the UUID, for example:

    "c:\Program Files\Oracle\VirtualBox\VBoxHeadless" -s 1aa2fa18-fc25-4610-91ca-6e984b33edd2

    The ‘-s’ parameter is short for ‘--startvm’. Giving it a try, you will find that it starts the VM and doesn’t show the GUI – however, the command prompt sticks around, which rather defeats the point of trying to do away with that extra window lying around (and closing out that command prompt aborts the VM).

    A bit of research shows some interesting options, which I will briefly outline below – I’ll leave it up to you to decide which one is the best.

    Method 1 – VBScript

    Write a VBScript to launch VBoxHeadless so that you can alter the window settings – it works, but seems overly complex for the task, without being particularly interesting. Easy enough to come across elsewhere, so I won’t make any further mention of it here.

    Method 2 – Modify the VBoxHeadless binary

    This involves a hex editor and actually changing the bytes of a program – definitely more ‘hackish’ than using VBScript – but also more interesting.

    Most Windows applications usually retain the (DOS) MZ header – literally, they start with MZ (ASCII) (4D 5A in hex) – so they can display the “This program cannot be run in DOS mode” error. Such programs define the offset to the PE header at byte 60 (in decimal). Looking at position 60 of VBoxHeadless we get the hex value F8 (which is 248 in decimal). As you might guess the PE header starts with… ‘PE’ (in ASCII) – which is 50 45 in hex. Sure enough, going to position 248 of our binary shows us the expected start of the PE header.

    The structure of the PE header is described in this MSDN article (it does make for an interesting read if you are into that sort of thing). The particular header field we are interested in – ‘Subsystem’ – begins on the 93rd byte of the header (add up all the BYTE, WORD (2 bytes each), and DWORD (4 bytes each) values before it to get 88, plus 4 bytes for the PE signature, giving an offset of 92). Starting from position 248, this takes us to 340. This field is 2 bytes in length (WORD), and is set to 03 00 (0x0003) in the original VBoxHeadless. It defines the subsystem that the application will run on – changing this from ‘03 00’ (character subsystem) to ‘02 00’ (GUI subsystem) allows the program to run without the console window. (So, just to restate that – change byte 340 of VBoxHeadless from ‘03’ to ‘02’). (I am using VBoxHeadless v4.1.8r75467 at the time of writing)
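
    The same offsets can be checked without a hex editor. The following is a sketch for a Unix-like shell with od available (for example on a copy of the file moved to a Linux box); pe_subsystem is a name of my own choosing. It reads the PE header offset from byte 60, adds 92, and prints the Subsystem value as a decimal number.

```shell
# Print the PE 'Subsystem' field of the executable named in $1.
# Hypothetical helper; bytes are read one at a time and combined as
# little-endian, so the result does not depend on the host byte order.
pe_subsystem() {
    file=$1
    # 4-byte offset to the PE header, stored at byte 60 of the MZ header
    set -- $(od -An -tu1 -j60 -N4 "$file")
    pe_offset=$(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
    # 2-byte Subsystem field, 92 bytes into the PE header
    set -- $(od -An -tu1 -j$(( pe_offset + 92 )) -N2 "$file")
    echo $(( $1 + ($2 << 8) ))
}

# Intended use (not run here):
#   pe_subsystem VBoxHeadless.exe
```

    For the VBoxHeadless binary described above, this should report 3 (248 + 92 = 340, where the bytes are 03 00), and 2 once the byte has been changed.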

    This method does work quite well – (obviously, you should backup your original executable before trying it) – and while easy enough, seems both overly complex for the purpose and will probably break with each upgrade.

    Method 3 – The start command

    The ‘start’ command will launch a program, and it offers the /B parameter to do so without creating a new window. This one is rather interesting, in that by itself, the start command seems to still leave a command prompt window lying around. At first glance, it seemed pretty much identical to using VBoxHeadless directly – however, I was able to exit the running program (using Ctrl+C) without it terminating (on Windows 7 SP1). On the other hand, I have read quite a few comments suggesting that trying to exit it will terminate the program – so your mileage may vary. Obviously, if the program terminates it defeats the point – and at the same time, keeping a useless command prompt around isn’t ideal. Redirecting the output of start to nul got me back to the command prompt (trying to redirect the output of VBoxHeadless directly to nul, on the other hand, didn’t return me to the command prompt – just left me with a blinking cursor).

    start /D "C:\program files\oracle\virtualbox" /B VBoxHeadless -s UUID > nul 2>&1

    For example:

    start /D "C:\program files\oracle\virtualbox" /B VBoxHeadless -s 1aa2fa18-fc25-4610-91ca-6e984b33edd2 > nul 2>&1

    Sadly though, while you are returned to the command prompt, and everything seems great – exiting the command prompt aborts the VM – so this one doesn’t actually work.

    Method 4 – PowerShell’s start-process

    The final approach, which I rather liked, was using PowerShell. Now, I am not someone who has any interest in Windows sysadmin stuff – so using PowerShell isn’t exactly part of my skill set, but if something performs the task I need in a reasonable manner, I am open to using it. PowerShell has a ‘start-process’ command to which you can directly pass a WindowStyle parameter. In many ways this is very similar to what is being done in most VBScript approaches to this problem, I just find this to be a more elegant solution. Running it from the command prompt, you get:

    powershell start-process 'C:\program files\oracle\virtualbox\vboxheadless' '-s 1aa2fa18-fc25-4610-91ca-6e984b33edd2' -WindowStyle Hidden

    Of particular note here, is that:

    1. You must provide the parameters to VBoxHeadless as a separate parameter to start-process and
    2. Using double quotes around the parameters passed to start-process does not work (when done in this manner – it works just fine if you first launch PowerShell – but that is an extra step).

    With that annoyance out of the way, perhaps I can now actually use my VM for what it was intended…

    ]]>
    http://www.thatsgeeky.com/2012/03/windowless-virtualbox-vms-windows-host/feed/ 2
    Directly connecting to PHP-FPMhttp://www.thatsgeeky.com/2012/02/directly-connecting-to-php-fpm/ http://www.thatsgeeky.com/2012/02/directly-connecting-to-php-fpm/#comments Wed, 01 Feb 2012 22:01:53 +0000 cyberx86 http://www.thatsgeeky.com/?p=809 Continue reading ]]> When it comes to troubleshooting, it is ideal to be able to isolate each component of a system. In the case where multiple connected items are performing correctly, they can sometimes be grouped together – however, if one of these items is not functioning, diagnostics become much harder.

    My typical web server stack includes:

    • Varnish as a caching proxy
    • Nginx as a web server
    • PHP-FPM for PHP via FastCGI

    In the above setup, Varnish and nginx run on different ports, making it fairly easy to bypass Varnish and query nginx directly, however, it isn’t quite as easy to query PHP-FPM directly. Without being able to do so, it can be difficult to determine if a misconfiguration is on the nginx side or the PHP-FPM side.

    Talking to your FastCGI server

    (Some commands below are specific to CentOS/RHEL, but the basic ideas should be generally applicable)

    Unlike many other services (e.g. SMTP servers), it is not possible to connect to a FastCGI server using telnet – the communication protocol is not plaintext.

    Luckily though, one of the tools included with FastCGI is cgi-fcgi – a bridge from CGI to FastCGI, which we can use to query a FastCGI server directly, bypassing our web server.

    On RHEL/CentOS, cgi-fcgi is included in the fcgi package, and can be installed with:

    yum --enablerepo=epel install fcgi

    The package is fairly small (84 kB installed) and is available from EPEL (currently v2.4.0-10.el6). cgi-fcgi installs to /usr/bin.

    A notable difference between FastCGI and most other servers is that parameters are set as environment variables. As such, the necessary environment variables must be set before cgi-fcgi is called if a successful page is to be returned.

    To connect to a FastCGI server that is already running, we pass the -bind and -connect parameters (single dashes), with the socket path (or address and port). For instance:

    cgi-fcgi -bind -connect 127.0.0.1:9010

    In its simplest form, we need to pass only the SCRIPT_FILENAME, SCRIPT_NAME, and REQUEST_METHOD to the application. More complex setups will require additional variables, especially DOCUMENT_ROOT and QUERY_STRING.

    PHP-FPM can be configured to respond to pings – that is, it will serve a predefined response every time a particular path is queried. Pings must be enabled on a per-pool basis, from the php-fpm config, by adding (or uncommenting) the following line (you can change the path as desired):

    ping.path = /ping

    (The response can be set with ping.response = response – but will default to ‘pong’ if omitted).

    For the changes to the config to be picked up, you must reload php-fpm:

    service php-fpm reload

    You can test this by running the following (again, modify the connect line to reference the socket on which PHP-FPM listens)

    SCRIPT_NAME=/ping \
    SCRIPT_FILENAME=/ping \
    REQUEST_METHOD=GET \
    cgi-fcgi -bind -connect 127.0.0.1:PORT

    A typical response may be:

    X-Powered-By: PHP/5.3.9
    Content-Type: text/plain
    Expires: Thu, 01 Jan 1970 00:00:00 GMT
    Cache-Control: no-cache, no-store, must-revalidate, max-age=0
    
    pong

    The above can be modified to return a PHP page, by passing the correct SCRIPT_FILENAME, SCRIPT_NAME, QUERY_STRING, and DOCUMENT_ROOT.
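
    As a rough sketch of what that looks like for a real PHP page, the following Python builds the variables and hands them to cgi-fcgi. The function names, the document root and the script path are my own placeholders (not anything PHP-FPM defines), and I am assuming that SCRIPT_FILENAME should be the full on-disk path, which is what a typical nginx fastcgi_param setup passes.

```python
import os
import subprocess


def build_fcgi_env(script_path, document_root, query_string=""):
    """Return the CGI-style variables for one GET request.

    script_path is the URL path of the script (e.g. "/info.php");
    document_root is the directory that path is resolved against.
    Both are hypothetical values supplied by the caller.
    """
    return {
        "REQUEST_METHOD": "GET",
        "DOCUMENT_ROOT": document_root,
        "SCRIPT_NAME": script_path,
        # Assumption: PHP-FPM opens this file, so give it the full path.
        "SCRIPT_FILENAME": document_root.rstrip("/") + script_path,
        "QUERY_STRING": query_string,
    }


def query_fpm(socket, script_path, document_root, query_string=""):
    """Run cgi-fcgi against socket (host:port or socket path); return its output."""
    env = dict(os.environ)
    env.update(build_fcgi_env(script_path, document_root, query_string))
    result = subprocess.run(
        ["cgi-fcgi", "-bind", "-connect", socket],
        env=env, capture_output=True, text=True, check=False,
    )
    return result.stdout
```

    Calling query_fpm("127.0.0.1:9010", "/info.php", "/var/www/html") would then be roughly equivalent to exporting the variables by hand and running cgi-fcgi yourself.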

    Retrieving the PHP-FPM status page

    The status page seems to be a minimally documented feature of PHP-FPM. It is a brief server-generated page (TEXT, JSON, HTML, or XML) which provides basic information about a pool. To enable it, add (or uncomment) the following in your php-fpm config, on a per-pool basis:

    pm.status_path = /status

    As with ping, you can modify the path that will return the status page. The path must begin with a slash (of course, it is good practice to not include a ‘php’ extension on it, as that would be needlessly confusing).

    Note: since the status page will offer different formats depending on the query string, you must set the QUERY_STRING variable, or the page will not work. Valid query strings are json, xml, html – all other query strings will return a plain text version.

    For example, using the default settings, the following will return plain text statistics (again, modify the connect line to reference the socket on which PHP-FPM listens):

    SCRIPT_NAME=/status \
    SCRIPT_FILENAME=/status \
    QUERY_STRING= \
    REQUEST_METHOD=GET \
    cgi-fcgi -bind -connect 127.0.0.1:PORT

    Sample output:

    X-Powered-By: PHP/5.3.9
    Expires: Thu, 01 Jan 1970 00:00:00 GMT
    Cache-Control: no-cache, no-store, must-revalidate, max-age=0
    Content-Type: text/plain
    
    pool: web1
    process manager: dynamic
    start time: 01/Feb/2012:20:49:44 -0500
    start since: 1214
    accepted conn: 5
    listen queue: 0
    max listen queue: 0
    listen queue len: 128
    idle processes: 1
    active processes: 1
    total processes: 2
    max active processes: 1
    max children reached: 0
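
    If you want to do something with these numbers (graph them, alert on them), the plain text format is easy enough to pick apart. Here is a minimal sketch in Python; parse_fpm_status is a name I made up, and it assumes the response looks like the sample above: headers, a blank line, then one "name: value" pair per line.

```python
def parse_fpm_status(raw):
    """Split a cgi-fcgi response into (headers, status) dictionaries."""
    head, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    headers = dict(
        line.split(": ", 1) for line in head.splitlines() if ": " in line
    )
    status = {}
    for line in body.splitlines():
        # Split on the first colon only: "start time" values contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        status[key.strip()] = int(value) if value.isdigit() else value
    return headers, status
```

    Numeric fields come back as integers, everything else (pool name, start time) stays a string.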

    For interest's sake, I threw together a simple Perl script that accepts a path as an argument, sets the basic variables, and calls cgi-fcgi. Note, it does not set the DOCUMENT_ROOT, which may be needed for more complex scripts.

    fcgi.pl:

    #!/usr/bin/perl -w
    use strict;
     
    my $fpm = $ARGV[0];
    my $url = $ARGV[1];
     
    if (!defined $fpm || !defined $url ){
        print "Usage: $0 host:port|path/to/socket /path/to/file \n";
        exit 1;
    }
     
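    # $1 = full path, $2 = last path segment, $3 = query string (if any)
    if($url =~ /^((?:\/.*)?(\/[^?]*))(?:\?(.*))?$/) {
        $ENV{REQUEST_METHOD}='GET';
        $ENV{SCRIPT_FILENAME}= $1;
        $ENV{SCRIPT_NAME}= $2;
        $ENV{QUERY_STRING}= $3 // '';
    } else {
        # Without these variables the request cannot succeed, so stop here.
        die "Could not parse path '$url' (it must begin with a slash)\n";
    }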
     
    system ('cgi-fcgi', '-bind', '-connect', $fpm);

    To request the status page in JSON format, using the above, you would run:

    ./fcgi.pl 127.0.0.1:9010 /status?json

    (There is a space between the socket and the path – they are two separate parameters)

    The PHP-FPM Status Page

    The best documentation of the status page that I have come across is in the php-fpm config file. As such, it takes a bit more effort than desirable to discern the meaning of some of the values. Here is my current understanding of them (PHP v5.3.9):

    • pool – the name of the pool that is listening on the connected socket, as defined in the php-fpm config.
    • process manager – the method used by the process manager to control the number of child processes – either dynamic or static – set on a per pool basis (in the php-fpm config) by the pm parameter.
    • start time – the date, time, and UTC offset corresponding to when the PHP-FPM server was started.
    • start since – the number of seconds that have elapsed since the PHP-FPM server was started (i.e. uptime).
    • accepted conn – the number of incoming requests that the PHP-FPM server has accepted; when a connection is accepted it is removed from the listen queue (displayed in real time).
    • listen queue – the current number of connections that have been initiated, but not yet accepted. If this value is non-zero it typically means that all the available server processes are currently busy, and there are no processes available to serve the next request. Raising pm.max_children (provided the server can handle it) should help keep this number low. This property follows from the fact that PHP-FPM listens via a socket (TCP or file based), and thus inherits some of the characteristics of sockets.
    • max listen queue – the maximum value the listen queue has reached since the server was started.
    • listen queue len – the upper limit on the number of connections that will be queued. Once this limit is reached, subsequent connections will either be refused or ignored. This value is set by the php-fpm per-pool configuration option ‘listen.backlog’, which defaults to -1 (unlimited). However, this value is also limited by the system (sysctl) value ‘net.core.somaxconn’, which defaults to 128 on many Linux systems.
    • idle processes – the number of servers in the ‘waiting to process’ state (i.e. not currently serving a page). This value should fall between the pm.min_spare_servers and pm.max_spare_servers values when the process manager is dynamic. (updated once per second)
    • active processes – the number of servers currently processing a page – the minimum is 1, since the process serving the status page itself is counted (so even on a fully idle server, the result will not read 0). (updated once per second)
    • total processes – the total number of server processes currently running; the sum of idle processes + active processes. If the process manager is static, this number will match pm.max_children. (updated once per second)
    • max active processes – the highest value that ‘active processes’ has reached since the php-fpm server started. This value should not exceed pm.max_children.
    • max children reached – the number of times that pm.max_children has been reached since the php-fpm server started (only applicable if the process manager is dynamic)
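
    Putting the descriptions above to work, here is a small sketch of the kind of sanity check I have in mind. It takes the status fields as a dictionary keyed by the names shown on the status page; the function name and the wording of the warnings are my own, and the rules are just my reading of the fields, not anything PHP-FPM ships.

```python
def fpm_warnings(status):
    """Return a list of human-readable warnings for one pool's status fields."""
    warnings = []
    if status["listen queue"] > 0:
        warnings.append("requests are waiting: all workers appear busy")
    if status["max listen queue"] >= status["listen queue len"] > 0:
        warnings.append("the listen queue has filled up at least once")
    if status["max children reached"] > 0:
        warnings.append(
            "pm.max_children was reached %d time(s)" % status["max children reached"]
        )
    total = status["idle processes"] + status["active processes"]
    if total != status["total processes"]:
        warnings.append("idle + active does not equal total processes")
    return warnings
```

    For the sample output shown earlier this returns an empty list; a non-empty list is usually a hint to look at pm.max_children or listen.backlog.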
    ]]>
    http://www.thatsgeeky.com/2012/02/directly-connecting-to-php-fpm/feed/ 6
    Combating WordPress Spamhttp://www.thatsgeeky.com/2012/01/combating-wordpress-spam/ http://www.thatsgeeky.com/2012/01/combating-wordpress-spam/#comments Tue, 24 Jan 2012 16:51:31 +0000 cyberx86 http://www.thatsgeeky.com/?p=800 Continue reading ]]> This post is rather different than my normal style – but it is something that has been driving me increasingly crazy of late. This site started in late 2010, and by early 2011 was receiving about 1 spam comment a day (30 a month). From the very start, I have used Akismet to filter spam, but have also looked through all the spam comments to see if there were any false positives. While Akismet has so far had an exceptional success rate (just about perfect I think – no false positives or missed spam), I still feel it is important to take the time to perform such a check.

    Recently, however – in the last quarter of 2011 – my spam comment volume has increased to the point of this no longer being possible. While certainly not on the level of any large site, I currently get about 50 spam comments a day – over 1500 per month.

    For quite some time I have wanted a plugin that would notify the commenter if Akismet flagged a comment as spam, and allow them to modify it (or, if not modified, just reject it). This would provide some transparency for potential commenters, and so limit the effect of potential false positives.

    A common suggestion is to use a CAPTCHA – however, it seems that these days CAPTCHAs are sufficiently hard to decipher so as to turn off most casual commenters. I really am not looking to inconvenience humans – just computers.

    The perfect solution to my problem was the Conditional CAPTCHA plugin – it doesn’t display a CAPTCHA unless Akismet flags a comment as spam – this avoids inconveniencing legitimate commenters, and provides a notification (and a second chance) to anyone unfortunate enough to have their comment erroneously flagged as spam.

    Having tested it for a few days – I am now spam free, and without having the lingering doubt that legitimate comments are not making their way through. My only wish is that it was a bit easier to find this combination of plugins.

    ]]>
    http://www.thatsgeeky.com/2012/01/combating-wordpress-spam/feed/ 0
    Autoscaling with custom metricshttp://www.thatsgeeky.com/2012/01/autoscaling-with-custom-metrics/ http://www.thatsgeeky.com/2012/01/autoscaling-with-custom-metrics/#comments Mon, 23 Jan 2012 05:22:32 +0000 cyberx86 http://www.thatsgeeky.com/?p=791 Continue reading ]]> One of the appeals of cloud computing is the idea of using what you need when you need it. One of the ways that Amazon provides for this is through autoscaling. In essence, this allows you to vary the number of (related) running instances according to some metric that is being tracked.

    In this article, we look at how you can trigger a change in the number of running instances using a custom Cloudwatch metric – including the setup of said metric, and a brief look at the interactions between the various autoscaling commands used.

    Setting up a custom Cloudwatch metric

    Autoscaling uses Cloudwatch alarms to trigger events. In order to use it, we therefore need a functioning Cloudwatch metric and alarm.

    Cloudwatch’s basic monitoring is in 5 minute increments and measures a number of parameters that are independent of the operating system and user data, including CPU utilization, disk I/O (in operations and bytes) and network usage. Additional metrics are made available for other services (e.g. EBS volumes, SNS, etc.)

    Cloudwatch metrics do not have to be precreated, nor is it necessary to allocate space for them. It is worth mentioning that you cannot delete a metric – any data saved is retained for 2 weeks. Metrics are automatically created when data is added to them.

    Custom metrics are created using the ‘PutMetricData’ request. This is available as one of the CLI tools for AWS, mon-put-data:

    mon-put-data
        --metric-name value --namespace value [--dimensions
        "key1=value1,key2=value2..." ] [--timestamp value ] [--unit value ]
        [--value value ] [--statisticValues "SampleCount=value, Sum=value,
        Maximum=value, Minimum=value" ] [General Options]

    Note: Metrics differing in any name, namespace, or dimensions (case sensitive) are classified as different metrics.
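
    To make that note concrete, one way to think about it is that a metric's identity is the combination of namespace, name, and the full set of dimensions. A tiny sketch (metric_key is purely an illustrative helper of mine, not part of any AWS tool):

```python
def metric_key(namespace, name, dimensions=None):
    """Build a hashable identity for a metric; every comparison is case sensitive."""
    return (namespace, name, frozenset((dimensions or {}).items()))
```

    With this, metric_key("System/Linux", "UsedMemoryPercent", {"InstanceID": "i-1"}) and the same call with "InstanceId" produce different keys, which mirrors how two separate metrics would appear in Cloudwatch.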

    Pass the command a metric name, namespace, credentials, and a value and you are good to go. It might take a few minutes for the results to initially show up on Cloudwatch, but from my experience it is usually only a few seconds.

    It is worth noting, here, that the AWS command line tools are Java based, and have to load Java before they can do anything, which makes them quite slow. Nonetheless, they are easy to use, and a good starting point (we’ll look at an alternate approach later).

    I’ll use the example of used memory throughout this post, since it is a fairly common use case (and it can be easily adapted for other metrics).

    Amazon has posted a bash script to get us started, modified slightly below:

    #!/bin/bash
     
    export AWS_CLOUDWATCH_HOME=/opt/aws/apitools/mon
    export EC2_PRIVATE_KEY=/path/to/pk-XXXXXXXXXXXXXXXXXXXXXXXXXX.pem
    export EC2_CERT=/path/to/cert-XXXXXXXXXXXXXXXXXXXXXXXXXX.pem
    export AWS_CLOUDWATCH_URL=https://monitoring.amazonaws.com
    export PATH=$AWS_CLOUDWATCH_HOME/bin:$PATH
    export JAVA_HOME=/usr/lib/jvm/jre
     
    # get ec2 instance id
    instanceid=`/usr/bin/curl -s http://169.254.169.254/latest/meta-data/instance-id`
     
    memtotal=`free -m | grep 'Mem' | tr -s ' ' | cut -d ' ' -f 2`
    memfree=`free -m | grep 'buffers/cache' | tr -s ' ' | cut -d ' ' -f 4`
    let "memused=100-memfree*100/memtotal"
     
    mon-put-data --metric-name "UsedMemoryPercent" --namespace "System/Linux" --dimensions "InstanceID=$instanceid" --value "$memused" --unit "Percent"

    The script takes the number from the ‘-/+ buffers/cache’ row under the ‘free’ column, as a percent of ‘total’ (under the ‘Mem’ row), and sets up one metric (UsedMemoryPercent), in the namespace ‘System/Linux’, with a single dimension (InstanceID).

    Notes:

    • AWS_CLOUDWATCH_HOME/bin contains the cloudwatch command line tools
    • The paths I have used, above, are for Amazon’s Linux AMI
    • As a personal preference, I have used curl instead of wget.
    • It should also be mentioned that the bash math used above will only yield integer results.
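
    To illustrate that last point about integer math, here is the same formula written twice in Python: once mirroring what bash does, and once with floating-point division. The function names and the sample numbers are made up for the illustration.

```python
def used_percent_bash(memfree, memtotal):
    """Mirror of the bash `let` expression: every operation is on integers."""
    return 100 - memfree * 100 // memtotal


def used_percent_float(memfree, memtotal):
    """Same formula with floating-point division, rounded to two decimals."""
    return round(100 - memfree * 100 / memtotal, 2)
```

    With 600 MB free out of 1700 MB, the integer version gives 65 while the floating-point version gives 64.71. For an alarm threshold of 85% the difference hardly matters, but it is worth knowing it is there.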

    To use the script, make it executable:

    chmod +x /path/to/script.sh

    Set it up to run every 5 minutes with crontab -e:

    */5 * * * * /path/to/script.sh

    The project ‘aws-missing-tools’, hosted on Google Code has a few more scripts, similar to the one above, for gathering other metrics.

    Due to the poor performance of the AWS CLI tools, it is far more efficient to call the API directly. This can be accomplished using any of the available SDKs (e.g. PHP, Ruby, etc.). However, even an SDK seems to be overkill for one command. I came across a simple python script, from Loggly, that signs the passed parameters, and can easily be set up to put the Cloudwatch metrics. I have modified it to be a single script that accepts the value and the instance ID as command line arguments:

    cloudfront-mem.py:

    import httplib2, sys, os, base64, hashlib, hmac, time
    import json as simplejson
    from urllib import urlencode, quote_plus
     
    aws_key = 'AWS_ACCESS_KEY_ID'
    aws_secret_key = 'AWS_SECRET_ACCESS_KEY_ID'
     
    value = sys.argv[1]
    instanceid = sys.argv[2]
     
    params = {'Namespace': 'System/Linux',
     'MetricData.member.1.MetricName': 'UsedMemoryPercent',
     'MetricData.member.1.Value': value,
     'MetricData.member.1.Unit': 'Percent',
     'MetricData.member.1.Dimensions.member.1.Name': 'InstanceID',
     'MetricData.member.1.Dimensions.member.1.Value': instanceid}
     
    def getSignedURL(key, secret_key, action, parms):
     
        # base url
        base_url = "monitoring.amazonaws.com"
     
        # build the parameter dictionary
        url_params = parms
        url_params['AWSAccessKeyId'] = key
        url_params['Action'] = action
        url_params['SignatureMethod'] = 'HmacSHA256'
        url_params['SignatureVersion'] = '2'
        url_params['Version'] = '2010-08-01'
        url_params['Timestamp'] = time.strftime("%Y-%m-%dT%H:%M:%S.000Z", time.gmtime())
     
        # sort and encode the parameters
        keys = url_params.keys()
        keys.sort()
        values = map(url_params.get, keys)
        url_string = urlencode(zip(keys,values))
     
        # sign, encode and quote the entire request string
        string_to_sign = "GET\n%s\n/\n%s" % (base_url, url_string)
        signature = hmac.new( key=secret_key, msg=string_to_sign, digestmod=hashlib.sha256).digest()
        signature = base64.encodestring(signature).strip()
        urlencoded_signature = quote_plus(signature)
        url_string += "&Signature=%s" % urlencoded_signature
     
        # do it
        foo = "http://%s/?%s" % (base_url, url_string)
        return foo
     
    class Cloudwatch:
        def __init__(self, key, secret_key):
            self.key = os.getenv('AWS_ACCESS_KEY_ID', key)
            self.secret_key = os.getenv('AWS_SECRET_ACCESS_KEY_ID', secret_key)
     
        def putData(self, params):
            signedURL = getSignedURL(self.key, self.secret_key, 'PutMetricData', params)
            h = httplib2.Http()
            resp, content = h.request(signedURL)
            #print resp
            #print content
     
    cw = Cloudwatch(aws_key, aws_secret_key)
    cw.putData(params)

    The above script has one dependency (httplib2) which wasn’t included in my default Python installation, and can be added with:

    yum --enablerepo=epel install python-httplib2
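
    The script above is written for Python 2. For anyone on Python 3, the signing step can be sketched with the standard library alone. This is my own adaptation rather than Loggly's code: sign_v2_query is a made-up name, the caller is expected to include Action and Version in params, and I am assuming the usual Signature Version 2 rules (parameters sorted by name, RFC 3986 percent-encoding, HMAC-SHA256 over "GET", host, path and query).

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote


def sign_v2_query(params, access_key, secret_key,
                  host="monitoring.amazonaws.com", timestamp=None):
    """Return a signed query string for a Signature Version 2 GET request."""
    query = dict(params)
    query.update({
        "AWSAccessKeyId": access_key,
        "SignatureMethod": "HmacSHA256",
        "SignatureVersion": "2",
        "Timestamp": timestamp or time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    })
    # Canonical form: keys in byte order, values percent-encoded (no "+" for spaces).
    canonical = "&".join(
        "%s=%s" % (quote(k, safe="-_.~"), quote(str(v), safe="-_.~"))
        for k, v in sorted(query.items())
    )
    string_to_sign = "GET\n%s\n/\n%s" % (host.lower(), canonical)
    digest = hmac.new(
        secret_key.encode(), string_to_sign.encode(), hashlib.sha256
    ).digest()
    signature = base64.b64encode(digest).decode()
    return canonical + "&Signature=" + quote(signature, safe="-_.~")
```

    The returned string is what goes after the "?" in the request URL. Fetching that URL (with urllib.request, for instance) is left out here to keep the sketch short.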

    I use the following bash script, from cron, to gather the data and call the python script.

    #!/bin/bash
     
    instanceid=`/usr/bin/curl -s http://169.254.169.254/latest/meta-data/instance-id`
     
    # Values are in kB; the patterns are quoted so the shell cannot expand them
    memtotal=`/bin/grep -i '^MemTotal:' /proc/meminfo | /bin/grep -o '[0-9]*'`
    memfree=`/bin/grep -i '^MemFree:' /proc/meminfo | /bin/grep -o '[0-9]*'`
    buffers=`/bin/grep -i '^Buffers:' /proc/meminfo | /bin/grep -o '[0-9]*'`
    cached=`/bin/grep -i '^Cached:' /proc/meminfo | /bin/grep -o '[0-9]*'`
     
    let "memusedpct=100-(memfree+buffers+cached)*100/memtotal"
     
    /usr/bin/python /path/to/cloudfront-mem.py $memusedpct $instanceid

    The absolute paths are used to avoid errors with cron. Of course, the above scripts have no real error checking in them – but they do serve my purposes quite well.
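
    Since a Python script is being called anyway, the data gathering could arguably live in Python as well. Here is a minimal sketch of that idea (the function names are mine). It reads the same four fields from the contents of /proc/meminfo and applies the same formula, just without the integer truncation.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[name] = int(rest.split()[0])
    return fields


def used_memory_percent(fields):
    """Used memory as a percentage, counting buffers and cache as available."""
    available = fields["MemFree"] + fields["Buffers"] + fields["Cached"]
    return 100.0 - available * 100.0 / fields["MemTotal"]
```

    Usage would be along the lines of used_memory_percent(parse_meminfo(open("/proc/meminfo").read())).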

    Hopefully, once you are up and running, you will see the new metric being graphed in CloudWatch.

    Setting up Autoscaling

    The setup of autoscaling is the same for custom metrics or existing instance-metrics.

    For ease of use (i.e. so we don’t have to pass them to every command), we should set (export) either:

    • AWS_CREDENTIAL_FILE or
    • both: EC2_PRIVATE_KEY and EC2_CERT

    The CLI tools for autoscaling are sufficient for our needs, since we only have to run them once – from the command line – and not multiple times.

    Create the launch config

    This step sets up the EC2 instance to launch – therefore, it resembles the call to ec2-run-instances. As with the run command, you must pass an AMI and instance type, but can also specify additional parameters such as a block device mapping, security group, or user data.

    as-create-launch-config
        LaunchConfigurationName --image-id value --instance-type value
        [--block-device-mapping "key1=value1,key2=value2..." ]
        [--monitoring-enabled/monitoring-disabled ] [--kernel value ] [--key
        value ] [--ramdisk value ] [--group value[,value...] ] [--user-data
        value ] [--user-data-file value ] [General Options]

    For example, to create a launch config called ‘geek-config’ which will launch an m1.small instance based on the 32-bit Amazon Linux AMI (ami-31814f58), using the keypair ‘geek-key’, into the security group ‘geek-group’, we would use the following:

    as-create-launch-config geek-config --image-id ami-31814f58 --instance-type m1.small --key geek-key --group geek-group

    Note: it is acceptable to use the security group name unless launching into VPC, in which case you must use the security group id.

    Create the autoscaling group

    Here we define the parameters for scaling – for instance, the availability zone(s) to launch into and the (upper and lower) limits on the number of instances – and we associate the group with the launch configuration we created previously. This command also gives us a chance to set up load balancers if needed, to specify a freeze time on scaling (i.e. while the group size is being adjusted), and to start with a number of instances other than the minimum value.

    as-create-auto-scaling-group
        AutoScalingGroupName --availability-zones value[,value...]
        --launch-configuration value --max-size value --min-size value
        [--default-cooldown value ] [--desired-capacity value ]
        [--grace-period value ] [--health-check-type value ] [--load-balancers
        value[,value...] ] [--placement-group value ] [--vpc-zone-identifier
        value ] [General Options]

    For example, to have a group (‘scaly-geek’) start with 2 instances and scale between 1 and 5 instances based on the above launch config, all of them in the us-east-1a availability zone, with a 3 minute freeze on scaling, we would use:

    as-create-auto-scaling-group scaly-geek --availability-zones us-east-1a --launch-configuration geek-config --min-size 1 --max-size 5 --default-cooldown 180 --desired-capacity 2

    Note: if you do not specify a --desired-capacity then the --min-size number of instances will be used.
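
    The relationship between these sizes can be summed up in a few lines of Python. This is only a model of the rules described here, not something the CLI tools provide; the class name and the 300 second default cooldown are my assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingGroup:
    name: str
    min_size: int
    max_size: int
    desired_capacity: Optional[int] = None
    default_cooldown: int = 300  # seconds; assumed default, override as needed

    def __post_init__(self):
        # With no desired capacity given, the group starts at its minimum size.
        if self.desired_capacity is None:
            self.desired_capacity = self.min_size
        if not 0 <= self.min_size <= self.desired_capacity <= self.max_size:
            raise ValueError("need 0 <= min-size <= desired-capacity <= max-size")
```

    The group from the example would be ScalingGroup("scaly-geek", 1, 5, desired_capacity=2, default_cooldown=180).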

    Create a policy to scale with

    This command allows us to define a new capacity – either via a change (numerical or percent) or by specifying an exact number of instances, and associates itself with the scaling group we have created previously. Negative numbers are used to represent a decrease in the number of instances. This policy will be referenced by its Amazon Resource Name (ARN) and used as the action of a Cloudwatch alarm. We can create multiple policies depending on our needs, but at least two policies – one to scale up and one to scale down – are common.

    as-put-scaling-policy
        PolicyName --type value --auto-scaling-group value --adjustment
        value [--cooldown value ] [General Options]

    Acceptable values for --type are: ExactCapacity, ChangeInCapacity, and PercentChangeInCapacity; the --cooldown value specified here will override the one specified in our scaling group, above.
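
    To show how the three types differ, here is a sketch of how I understand the new capacity to be worked out. The function is hypothetical, and two details are assumptions on my part: that the result is always clamped to the group's min and max size, and that a fractional percentage change is rounded toward zero except that a non-zero change smaller than one instance counts as one instance.

```python
import math


def apply_policy(current, adjustment, policy_type, min_size, max_size):
    """Return the group's new capacity after a scaling policy runs."""
    if policy_type == "ExactCapacity":
        target = adjustment
    elif policy_type == "ChangeInCapacity":
        target = current + adjustment
    elif policy_type == "PercentChangeInCapacity":
        change = current * adjustment / 100.0
        # Assumed rounding: toward zero, but a non-zero change never becomes 0.
        if 0 < abs(change) < 1:
            change = math.copysign(1, change)
        target = current + math.trunc(change)
    else:
        raise ValueError("unknown policy type: %r" % policy_type)
    # Assumed: the result never leaves the group's size limits.
    return max(min_size, min(max_size, target))
```

    For a group running 2 instances with limits of 1 and 5, apply_policy(2, 1, "ChangeInCapacity", 1, 5) gives 3, while the same policy on a group already at 5 leaves it at 5.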

    ARNs refer to resources across all the AWS products and take the form:

    arn:aws:<vendor>:<region>:<namespace>:<relative-id>

    Where:

    • vendor identifies the AWS product (e.g., sns)
    • region is the AWS Region the resource resides in (e.g., us-east-1), if any
    • namespace is the AWS account ID with no hyphens (e.g., 123456789012)
    • relative-id is the service specific portion that identifies the specific resource

    Not all fields are required by every resource.
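
    Since the relative-id may itself contain colons or slashes, the safe way to take an ARN apart is to split on the first five colons only. A minimal sketch (parse_arn and the Arn tuple are names I chose, following the field names used above):

```python
from collections import namedtuple

Arn = namedtuple("Arn", "vendor region namespace relative_id")


def parse_arn(arn):
    """Split an ARN into its fields; optional fields come back as empty strings."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[1] != "aws":
        raise ValueError("not an ARN: %r" % arn)
    return Arn(*parts[2:])
```

    For a made-up SNS topic, parse_arn("arn:aws:sns:us-east-1:123456789012:my-topic") returns Arn(vendor='sns', region='us-east-1', namespace='123456789012', relative_id='my-topic').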

    Sample Output:

    POLICY-ARN arn:aws:autoscaling:us-east-1:0123456789:scalingPolicy/abc-1234-def-567

    It is important to note down the ARN returned by this command as it will be needed in order to associate the policy with a cloudwatch alarm. Each time you run the command you will get a unique ARN.

    For example, to create a (scale-up) policy (LowMemPolicy), based on our scaling group from above, where we want to add one instance, we would use:

    as-put-scaling-policy LowMemPolicy --auto-scaling-group scaly-geek --adjustment=1 --type ChangeInCapacity

    To do the same, but scale down (by one instance) instead, we would use:

    as-put-scaling-policy HighMemPolicy --auto-scaling-group scaly-geek --adjustment=-1 --type ChangeInCapacity

    Note: you can test your policy by using the command as-execute-policy:

    as-execute-policy
        PolicyName [--auto-scaling-group  value ]
        [--honor-cooldown/no-honor-cooldown  ]  [General Options]

    Create Cloudwatch Alarms

    This is the final step – tying everything together. We have collected data in Cloudwatch, and we can now set up an alarm to be triggered when our metric breaches the target value. This alarm will then be set up to perform one or more actions, specified by their ARN(s). In our case, the alarm will trigger a scaling policy – which will then change the number of instances in our scaling group.

    As with our scaling policies, there can be multiple alarms – in our case, two – one to define the lower bound (and trigger our scale-down policy) and one to define the upper bound (and trigger our scale-up policy).

    mon-put-metric-alarm
        AlarmName --comparison-operator value --evaluation-periods value
        --metric-name value --namespace value --period value --statistic
        value --threshold value [--actions-enabled value ] [--alarm-actions
        value[,value...] ] [--alarm-description value ] [--dimensions
        "key1=value1,key2=value2..." ] [--insufficient-data-actions
        value[,value...] ] [--ok-actions value[,value...] ] [--unit value ]
        [General Options]

    Notes:

    • --metric-name and --namespace must match those used to create the original Cloudwatch metric.
    • --period and --evaluation-periods are both required. The former defines the length of one period in seconds, and the latter defines the number (integer) of consecutive periods that must match the criteria to trigger the alarm.
    • Valid values for --comparison-operator are: GreaterThanOrEqualToThreshold, GreaterThanThreshold, LessThanThreshold, and LessThanOrEqualToThreshold
    • Valid values for --statistic are: SampleCount, Average, Sum, Minimum, Maximum

    For example, to create an alarm which will trigger a scaling policy whenever our UsedMemoryPercent averages over 85% for 2 consecutive 5 minute periods we would use:

    mon-put-metric-alarm LowMemAlarm --comparison-operator GreaterThanThreshold --evaluation-periods 2 --metric-name UsedMemoryPercent --namespace "System/Linux" --period 300 --statistic Average --threshold 85 --alarm-actions arn:aws:autoscaling:us-east-1:0123456789:scalingPolicy/abc-1234-def-567 --dimensions "AutoScalingGroupName=scaly-geek"

    For the alarm which will trigger our scale-down policy once our UsedMemoryPercent averages below 60% for 2 consecutive 5 minute periods we would use:

    mon-put-metric-alarm HighMemAlarm --comparison-operator LessThanThreshold --evaluation-periods 2 --metric-name UsedMemoryPercent --namespace "System/Linux" --period 300 --statistic Average --threshold 60 --alarm-actions arn:aws:autoscaling:us-east-1:0123456789:scalingPolicy/bcd-2345-efg-678 --dimensions "AutoScalingGroupName=scaly-geek"

    Just to clarify, the aggregation (--statistic) is performed over a single period – and its result must compare (--comparison-operator) to the threshold for the specified number of consecutive periods (--evaluation-periods).
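
    In code form, my understanding of that evaluation looks roughly like the following. It is a simplified model with names of my own choosing: it only handles the Average statistic, and it treats a period with no data as not breaching, whereas real alarms have a separate insufficient-data state.

```python
def alarm_triggered(samples, period, evaluation_periods, threshold, now,
                    compare=lambda value, limit: value > limit):
    """Decide whether an alarm would fire.

    samples is a list of (timestamp_in_seconds, value) pairs. Returns True when
    the average of each of the last `evaluation_periods` periods (ending at
    `now`) satisfies the comparison against the threshold.
    """
    for i in range(evaluation_periods):
        end = now - i * period
        start = end - period
        values = [v for t, v in samples if start <= t < end]
        if not values:
            return False  # simplification: missing data never triggers
        if not compare(sum(values) / len(values), threshold):
            return False
    return True
```

    The default comparison corresponds to GreaterThanThreshold; passing compare=lambda value, limit: value < limit models LessThanThreshold for the scale-down alarm.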

    The following two AWS CLI commands are helpful in debugging errors with autoscaling:

    mon-describe-alarm-history
        [AlarmName] [--end-date  value ] [--history-item-type  value ]
        [--start-date  value ]  [General Options]
    as-describe-scaling-activities
        [ActivityIds [ActivityIds ...] ] [--auto-scaling-group  value ]
        [--max-records  value ]  [General Options]

    Notes

    • The use of as-create-or-update-trigger is deprecated and should be avoided.
    • You can list your existing policies using as-describe-policies and delete policies with as-delete-policy
    • Alarms will display as red lines on your Cloudwatch graphs.
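    • The namespace, metric-name, and dimensions given to the alarm must exactly match those used when publishing the metric. Note that the alarm examples above use an AutoScalingGroupName dimension, so the metric data must be published with that same dimension (the scripts earlier in this post publish an InstanceID dimension instead).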

    References

    Run the CLI commands with the --help parameter to see the details of the available options.

    ]]>
    http://www.thatsgeeky.com/2012/01/autoscaling-with-custom-metrics/feed/ 3