WordPress – Automatically publish a pending post each weekday morning from a PHP script

In an earlier post, “Piwik Web Analytics, and some unfortunate stats about my blog”, I mentioned that the Feedburner stats for this blog show a relatively high subscribe/unsubscribe rate. I think a large part of that is my tendency to blog in spurts and, even worse, to write drafts and never publish them. To combat this, I’ve been trying to finish posts, set them to “Pending” status, and then go back and publish one every day (well, every day that I still have some sitting unpublished). Of course, that counts on me logging in to WordPress every day, which isn’t something I do. The following script is, at least for now, my answer.

This standalone PHP script uses wp-load.php to load the WordPress environment, then finds the oldest post with a given status (“pending” in my case) and attempts to publish it. It only does this if no other post has been published in the last 24 hours. The script can be found in subversion at http://svn.jasonantman.com/misc-scripts/wordpress_daily_post.php:

#!/usr/bin/php
<?php
/**
 * wordpress_daily_post.php
 * Script to publish the oldest post with a given status, if no
 * other post has been published in 24 hours. Intended to be run
 * via cron on weekdays.
 *
 * Copyright 2012 Jason Antman 
 *
 * Licensed under the Apache License, Version 2.0 
 *
 * use it anywhere you want, however you want, provided that this header is left intact,
 * and that if redistributed, credit is given to me.
 *
 * It is strongly requested, but not technically required, that any changes/improvements
 * be emailed to the above address.
 *
 * The latest version of this script will always be available at:
 * $HeadURL: http://svn.jasonantman.com/misc-scripts/wordpress_daily_post.php $
 * $LastChangedRevision: 40 $
 *
 * Changelog:
 * 2012-09-03 Jason Antman  - 1.0
 *  - first version
 */
 
# BEGIN CONFIGURATION
define('WP_LOAD_LOC', '/var/www/vhosts/blog.jasonantman.com/wp-load.php'); // Configure this to the full path of your Wordpress wp-load.php
define('SOURCE_POST_STATUS', 'pending'); // post status to publish
# END CONFIGURATION

$VERBOSE = false;
$DRY_RUN = false;
array_shift($argv);
while(count($argv) > 0) {
  if($argv[0] == "-d" || $argv[0] == "--dry-run"){
    $DRY_RUN = true;
    fwrite(STDERR, "DRY RUN ONLY - NOT ACTUALLY PUBLISHING.\n");
  }
  if($argv[0] == "-v" || $argv[0] == "--verbose"){
    $VERBOSE = true;
    fwrite(STDERR, "WP_LOAD_LOC=".WP_LOAD_LOC."\n");
    fwrite(STDERR, "SOURCE_POST_STATUS=".SOURCE_POST_STATUS."\n");
  }
  array_shift($argv);
}
 
$_SERVER['HTTP_HOST'] = 'localhost'; // needed for wp-includes/ms-settings.php:100
require_once(WP_LOAD_LOC);
 
# check that we're running on a weekday
if(date('N') >= 6) {
#  if($VERBOSE){ fwrite(STDERR, "today is a saturday or sunday, dieing.\n"); }
#  exit(1);
}
 
# find the publish date/time of the last published post
$published = get_posts(array('numberposts' => 1, 'orderby' => 'post_date', 'order' => 'DESC', 'post_status' => 'publish'));
$post = $published[0];
$pub_date = $post->post_date;
$pub_id = $post->ID;
 
if(strtotime($pub_date) >= (time() - 86400)) {
  if($VERBOSE){ fwrite(STDERR, "last post (ID $pub_id) within last day ($pub_date). Nothing to do. Exiting.\n"); }
  exit(0);
} else {
  if($VERBOSE){ fwrite(STDERR, "Found last post (ID $pub_id) with post date $pub_date.\n"); }
}
 
 
# find the earliest post of status SOURCE_POST_STATUS, if there is one.
$to_post = get_posts(array('numberposts' => 1, 'orderby' => 'post_date', 'order' => 'ASC', 'post_status' => SOURCE_POST_STATUS));
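if(count($to_post) < 1) {
  if($VERBOSE){ fwrite(STDERR, "No posts with status ".SOURCE_POST_STATUS." found. Nothing to do. Exiting.\n"); }
  exit(0);
}
$post = $to_post[0];
$to_pub_id = $post->ID;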
$to_pub_date = $post->post_date;
$to_pub_title = $post->post_title;
$now = time();
$new_date = date("Y-m-d H:i:s", $now);
$new_date_gmt = gmdate("Y-m-d H:i:s", $now);
 
if($VERBOSE){ fwrite(STDERR, "Post to publish: ID=$to_pub_id DATE=$to_pub_date NEW_DATE=$new_date TITLE=$to_pub_title\n"); }
 
# actually publish it
if(! $DRY_RUN){
  $arr = array('ID' => $to_pub_id, 'post_status' => 'publish', 'post_date' => $new_date, 'post_date_gmt' => $new_date_gmt);
  $ret = wp_update_post($arr); // publish the post
  if($ret == 0) {
    fwrite(STDERR, "ERROR: Post $to_pub_id was not successfully published.\n");
    exit(1);
  }
  if($VERBOSE){ fwrite(STDERR, "Published post. New ID: $ret\n"); }
}
else {
  fwrite(STDERR, "Dry run only, not publishing post.\n");
  exit(0); // nothing was published, so the verification below would always fail
}
 
# check that the post really was published
$published = get_posts(array('numberposts' => 1, 'orderby' => 'post_date', 'order' => 'DESC', 'post_status' => 'publish'));
$post = $published[0];
$pub_date = $post->post_date;
$pub_id = $post->ID;
$pub_title = $post->post_title;
$pub_guid = $post->guid;
 
if($pub_title != $to_pub_title) {
  fwrite(STDERR, "ERROR: title of most recent post does not match title of what we wanted to post.\n");
  exit(1);
}
 
fwrite(STDOUT, "Published post $pub_id at $pub_date\n");
fwrite(STDOUT, "Title: $pub_title\n");
fwrite(STDOUT, "\n\n\n GUID/Link: $pub_guid\n");
fwrite(STDOUT, "\n\n".__FILE__." on ".trim(shell_exec('hostname --fqdn'))." running as ".get_current_user()."\n");
 
?>

You’ll need to set WP_LOAD_LOC (line 29) to the full path of your WordPress installation’s wp-load.php (it should be in the top-level directory of your WordPress installation). I run this script from cron like:

0 6 * * 1-5 /home/jantman/bin/wordpress_daily_post.php --verbose # publish WP pending posts daily

so that it runs at 6 AM (local time) each weekday. Assuming you have cron set up to send you mail, you’ll get a daily message saying what was (or wasn’t) done.
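
The publish-or-skip decision can be pulled out of the WordPress-specific code and tested on its own. The sketch below is not part of the script above: should_publish_now() and its parameters are hypothetical names, and it assumes the weekday check is done in PHP rather than left to cron.

```php
<?php
// Hypothetical helper (not in the original script): decide whether a pending
// post should be published at $nowTs, given when the last post went out.
function should_publish_now($lastPublishedTs, $nowTs, $minGapSeconds = 86400)
{
    // date('N') yields 1 (Monday) through 7 (Sunday); 6 and 7 are the weekend.
    $isWeekday = (int) date('N', $nowTs) < 6;
    // Same comparison as the script: the last post must be older than the gap.
    $gapElapsed = ($nowTs - $lastPublishedTs) > $minGapSeconds;
    return $isWeekday && $gapElapsed;
}
```

One design point worth considering: a cron job that fires at the same minute every day can land a second or two short of a full 24 hours since the previous run's publish time, so a slightly smaller gap (for example 23 hours, passed as the third argument) may be more forgiving.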

Nagios / Icinga Configuration Highlighting with GeSHi

As you may know from former posts, this blog (WordPress-powered) and a few MediaWiki sites that I have use the excellent PHP-based GeSHi syntax highlighter. Today I was writing a post that includes some Icinga (Nagios) configuration snippets. After a quick search, I found a Nagios language file for GeSHi on GitHub. Thanks very much to Albéric de Pertat (adepertat) for writing this and providing it to the public.

PHP Script to Query Linode DNS Manager API

I’m in the process of moving all of my public-facing services, currently hosted on a single Linode, to a new virtual machine (still with Linode, of course, just a new CentOS 6 VM). Of course, I’ve got a lot (about 60) of DNS records, spread across 8 domains, that point at the old machine. For name-based vhosts in Apache, my usual procedure is to migrate everything over to the new host and then change DNS, and once the change propagates (I’m using Linode’s DNS hosting, so it makes things a LOT easier but I don’t have rndc reload anymore) I test in a browser and, assuming all is well, disable the vhost on the old server. To do all this, I need an easy way to get a list of all the DNS records that still point to the old machine.

Luckily, to augment their web-based control panel (Linode Manager), Linode has a pretty full-featured API with bindings for Python, Perl, PHP, Ruby, Java and others. While I like Python and I’m starting to learn Perl for my new job (by trying to shift most of my non-time-sensitive scripting to it), PHP is still my strongest language, and the majority of my existing administrative scripting is written in it, which is especially handy when it comes time to add a web front-end to things. So I wrote the following script to query Linode’s DNS Manager API using Kerem Durmus’ Linode API PHP wrapper (installation instructions and info at that GitHub link). The script simply writes all Linode DNS records for all zones as CSV to standard output (this could take a while if you have a lot of records…).

<?php
  /**
   * Script to pull DNS information for all of your Linode hosted zones, output as CSV.
   *
   * Originally created when I moved DNS from in-house to linode, then started moving subdomains one at a time from my servers to Linode.
   *
   * Uses Kerem Durmus' Linode PHP bindings from <https://github.com/krmdrms/linode/>, many thanks to him for releasing this.
   *
   * INSTALLATION (as per krmdrms README):
   *  pear install Net_URL2-0.3.1
   *  pear install HTTP_Request2-0.5.2
   *  pear channel-discover pear.keremdurmus.com
   *  pear install krmdrms/Services_Linode
   *
   * Also requires php-openssl / php5-openssl
   *
   * USAGE: php linodeDnsToCsv.php
   *
   * Copyright 2011 Jason Antman <http://www.jasonantman.com> <jason@jasonantman.com>, all rights reserved.
   * This script is free for use by anyone anywhere, provided that you comply with the following terms:
   * 1) Keep this notice and copyright statement intact.
   * 2) Send any substantial changes, improvements or bog fixes back to me at the above address.
   * 3) If you include this in a product or redistribute it, you notify me, and include my name in the credits or changelog.
   *
   * The following URL always points to the newest version of this script. If you obtained it from another source, you should
   * check here:
   * $HeadURL: http://svn.jasonantman.com/misc-scripts/linodeDnsToCsv.php $
   * $LastChangedRevision: 25 $
   *
   * CHANGELOG:
   * 2011-12-17 Jason Antman <jason@jasonantman.com>:
   *    merged into my svn repo
   * 2011-09-12 Jason Antman <jason@jasonantman.com>:
   *    initial version of script
   *
   */
 
require_once("/var/www/linode_apikey.php"); // PHP file containing:   define("API_KEY_LINODE", "myApiKeyHere");
require_once('Services/Linode.php');
 
// get list of all domains
$domains = array(); // DOMAINID => domain.tld
try {
  $linode = new Services_Linode(API_KEY_LINODE);
  $result = $linode->domain_list();
 
  foreach($result['DATA'] as $domain)
    {
      $domains[$domain['DOMAINID']] = $domain["DOMAIN"];
    }
}
catch (Services_Linode_Exception $e)
{
  // without a working API connection nothing below can succeed
  fwrite(STDERR, $e->getMessage()."\n");
  exit(1);
}
 
$records = array(); // array of resource records
$linode->batching = true;
foreach($domains as $id => $name)
{
  $linode->domain_resource_list(array('DomainID' => $id));
}
 
try {
  $result = $linode->batchFlush();
 
  foreach($result as $batchPart)
    {
      foreach($batchPart['DATA'] as $rrec)
	{
	  if(! isset($records[$rrec['DOMAINID']])){ $records[$rrec['DOMAINID']] = array();}
	  $records[$rrec['DOMAINID']][$rrec['RESOURCEID']] = array('name' => $rrec['NAME'], 'type' => $rrec['TYPE'], 'target' => $rrec['TARGET']);
	}
    }
}
catch (Services_Linode_Exception $e)
{
  echo $e->getMessage();
}
 
echo '"recid","domain","name","type","target"'."\n";
foreach($domains as $id => $name)
{
  if(! isset($records[$id])){ continue; } // this domain has no resource records
  foreach($records[$id] as $recid => $arr)
    {
      echo '"'.$recid.'","'.$name.'","'.$arr['name'].'","'.$arr['type'].'","'.$arr['target']."\"\n";
    }
}
 
 
?>

That will print out a list containing the Linode DNS record id (recid), domain, record name, type and target:

"recid","domain","name","type","target"
"137423","jasonantman.com","","TXT","v=spf1 mx:jasonantman.com -all"
"3597859","jasonantman.com","","MX","linode1.jasonantman.com"
"3488952","jasonantman.com","","mx","linode2.jasonantman.com"
"3472952","jasonantman.com","blog","CNAME","linode1.jasonantman.com"

If you want to, say, list only the records that point at host “linode1” (excluding linode1’s own A record), you could use grep and awk like:

php linodeDnsToCsv.php | grep linode1 | grep -v '"linode1","a"' | awk -F , '{printf "%-27s %-20s %-7s %s\n", $2, $3, $4, $5}' | sed 's/"//g'

I hope this helps someone else out, and saves them a few minutes of coding…
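
The same filtering can also be done in PHP instead of grep and awk. This is only a sketch: filter_records_by_target() is a hypothetical helper that is not part of the script above, and it assumes the five-column CSV layout shown in the sample output.

```php
<?php
// Hypothetical helper: keep only the CSV rows whose "target" column mentions $host.
function filter_records_by_target(array $csvLines, $host)
{
    $matches = array();
    foreach (array_values($csvLines) as $i => $line) {
        if ($i === 0 || trim($line) === '') {
            continue; // skip the header row and blank lines
        }
        // Columns: recid, domain, name, type, target
        $row = str_getcsv($line, ',', '"', '\\');
        if (count($row) === 5 && strpos($row[4], $host) !== false) {
            $matches[] = $row;
        }
    }
    return $matches;
}

// Usage with a few rows from the sample output above.
$lines = array(
    '"recid","domain","name","type","target"',
    '"3597859","jasonantman.com","","MX","linode1.jasonantman.com"',
    '"3488952","jasonantman.com","","mx","linode2.jasonantman.com"',
);
foreach (filter_records_by_target($lines, 'linode1') as $row) {
    printf("%-27s %-20s %-7s %s\n", $row[1], $row[2], $row[3], $row[4]);
}
```

Parsing with str_getcsv() rather than splitting on commas keeps TXT records such as SPF strings intact even if they contain commas.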

WP-Syntax Plugin GeSHi Path Fix

The WP-Syntax plugin for WordPress provides syntax highlighting for WordPress blogs via the GeSHi PHP syntax highlighter. Unfortunately, the plugin bundles its own copy of GeSHi (currently 1.0.8.9) in geshi/. As a result, not only are users of the plugin never prompted to use the latest version of GeSHi, but the plugin won’t use a host-wide GeSHi installation that’s already in the PHP include path (e.g. /usr/share/php/), like the many php-geshi packages offered by repositories including EPEL (for Fedora, CentOS and RHEL).

The fix is quite simple. Just open wp-syntax.php in the wp-syntax/ plugin directory in your favorite text editor and change the GeSHi include line (for WP-Syntax 0.9.12, this is line 53) from:

include_once("geshi/geshi.php");

to:

include_once("geshi.php");

If you already have GeSHi installed in the PHP include path, just remove the geshi directory in your wp-syntax/ plugin directory, flush the WordPress caches (if any), and load a page which uses GeSHi – it should now use the host-wide version. If you want to still use a local version for wp-syntax, you can move things around to where they should be in the wp-syntax/ plugin directory:

mv geshi/geshi.php . && mv geshi/geshi/* geshi/ && rmdir geshi/geshi

Note – if you’re in a shared hosting environment, or are otherwise not able to upgrade the php-geshi package on your server yourself, you might not want to do this.
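
Before removing the bundled copy, it may be worth confirming that PHP can actually find a host-wide geshi.php. A minimal sketch using PHP's stream_resolve_include_path() (available since PHP 5.3.2); the wrapper name find_on_include_path() is just for illustration:

```php
<?php
// Returns the absolute path PHP would load for $file via include_path,
// or false when the file cannot be resolved.
function find_on_include_path($file)
{
    return stream_resolve_include_path($file);
}

$geshi = find_on_include_path('geshi.php');
if ($geshi === false) {
    echo "geshi.php not found on include_path (" . get_include_path() . ")\n";
} else {
    echo "geshi.php resolves to $geshi\n";
}
```

Run it with the same PHP configuration your web server uses, since the CLI and Apache can have different include_path settings.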

I also posted about this in the WordPress support forums. Hopefully the WP-Syntax devs will include this change in the next version…

Puppet Syntax Highlighting with GeSHi

This blog runs on WordPress, and I also do quite a bit in PHP, so I’m familiar with the GeSHi syntax highlighter. It’s PHP-based, and can run both as a WordPress plugin (WP-Syntax) and as a PHP module. It also works quite well with the MediaWiki SyntaxHighlight GeSHi extension.

Today I was documenting some Puppet code in a wiki, and realized that I didn’t have syntax highlighting. Well, fellow Linux sysadmin and puppetmaster Jason Hancock was nice enough to post on his blog (Puppet Syntax Highlighting with GeSHi) that he’s developed a GeSHi language file for Puppet, available from GitHub. Many thanks!

php-suhosin syslog issues

I just installed php-suhosin 0.9.29 from EPEL on a CentOS 5.6 box. I’m running a whole bunch of name-based vhosts in Apache, and have a bunch of web apps, so I opted to run suhosin in simulation mode (don’t actually block anything, but log errors) and have it log via syslog to a single file. Unfortunately, when I configured this, the syslog messages started showing up in the wrong place, apparently with the wrong facility and priority. After some roundabout debugging (at first assuming syslogd to be the problem), I determined that, for whatever really strange reason (perhaps an incorrect syslog.h on the EPEL box that built the suhosin package?) the LOG_* constants were incorrect. I looked up the correct integer values in /usr/include/sys/syslog.h and the following configuration directives accomplished the task correctly:

suhosin.log.syslog.facility = 128
; 128 = LOG_LOCAL0
 
suhosin.log.syslog.priority = 5
; 5 = LOG_NOTICE

This one line puts suhosin into simulation mode, where it only logs violations instead of blocking them:

suhosin.simulation = On
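
The integer values used above follow from how syslog.h encodes facilities: the facility code is shifted left by three bits, leaving the low three bits for the severity. A small sketch of that arithmetic (the function name syslog_facility_value() is made up for illustration):

```php
<?php
// <sys/syslog.h> defines LOG_LOCAL0 as (16<<3); the low 3 bits hold the severity.
function syslog_facility_value($facilityCode)
{
    return $facilityCode << 3;
}

$facility = syslog_facility_value(16); // local0 has facility code 16
$severity = 5;                         // LOG_NOTICE

printf("suhosin.log.syslog.facility = %d\n", $facility);   // 128
printf("suhosin.log.syslog.priority = %d\n", $severity);   // 5
printf("combined syslog PRI value = %d\n", $facility | $severity); // 133
```

The same formula gives the other local facilities, e.g. local1 (code 17) is 136.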

Managing Ubiquiti Networks MAC ACLs from a script

I have a small web-based tool for allowing members of an organization to register their wireless MAC addresses, and then automatically adding them to the MAC ACL on Ubiquiti AirOSv2 APs. It’s a pretty quick hack, along with a simple and ugly web-based tool, but it gets the job done for a non-profit with only 25 people. After posting about it on the Ubiquiti forum and getting a request from someone for the code, I decided to put it out there for anyone who wants it. The script is mostly based on SCPing configs to and from the AP and SSHing in to run commands, and will need passwordless public key auth to the AP.

The code itself is in subversion at http://svn.jasonantman.com/misc-scripts/ubiquiti-mac-acl/. It’s composed of four files:

  • updateAPconfigs.php.inc – the main PHP file with three functions for working with the APs
  • wirelessTools.php – My PHP page for users to add MACs. It’s pretty rough and is mostly based on handling our LDAP authentication/group framework, but it gives a fair example of how I store MACs in a MySQL table and then rebuild a given AP config file with the current list of MACs. I doubt it will be useful to anyone else as more than an example.
  • wireless.sql – The schema for the SQL database I use to store MACs.
  • README.txt – Readme file including some warnings on the lack of error checking in the functions.

Hopefully this will be of some use to someone. I should probably mention two important things here. First, the AP only accepts up to 32 MAC addresses, so if you feed the makeNewConfigFile() function an array with more than 32, it will just stop at the 32nd. Second, be aware that this SCPs a config file to the AP, runs cfgmtd and then reboots the AP. If you send it a bad config file, who knows what will happen. And if you allow your users to add MAC addresses, your APs will reboot every time someone adds one.
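
Because anything past the 32nd address is silently ignored, it can help to clean and cap the list before it ever reaches makeNewConfigFile(). The following is a sketch only: prepare_mac_list() is a hypothetical helper that is not in the repository, and the accepted input formats are an assumption.

```php
<?php
// Hypothetical helper (not part of the original code): normalize, de-duplicate
// and cap a list of MAC addresses, reporting which ones did not fit.
function prepare_mac_list(array $macs, $limit = 32)
{
    $clean = array();
    foreach ($macs as $mac) {
        // Accept aa-bb-cc-dd-ee-ff or aa:bb:cc:dd:ee:ff, in any letter case.
        $mac = strtoupper(str_replace('-', ':', trim($mac)));
        if (preg_match('/^([0-9A-F]{2}:){5}[0-9A-F]{2}$/', $mac)) {
            $clean[$mac] = true; // array keys de-duplicate
        }
    }
    $all = array_keys($clean);
    return array(
        'accepted' => array_slice($all, 0, $limit),
        'dropped'  => array_slice($all, $limit),
    );
}
```

A caller could pass 'accepted' on to the config builder and show 'dropped' to the user, rather than letting entries disappear without notice.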

All I ask is that if you use this, leave a comment to thank me, and if you make any changes/additions/bugfixes, please send them back to me.

Also, I have some Nagios check scripts that are useful for Ubiquiti APs.

Documentation generation for web apps – PHP and JavaScript

Recently I’ve been making some changes to a relatively complex ePCR (electronic patient care report) program that I wrote for the ambulance corps. It’s a web application (available only on our LAN, of course) written in PHP, with a relatively large chunk of custom JavaScript to provide Ajax/DHTML functions. Most of the PHP code was already documented and processed with phpDocumentor (phpdoc) to generate API documentation. However, since so much of the functionality is DHTML-based, there was a lot of looking back to the JavaScript source to figure out what was called where.

My search for a true multi-language documentation generator was relatively fruitless. There’s doxygen, but that needed a Perl helper script for JavaScript files. Since virtually all of the code, both PHP and JavaScript, is purely procedural, I was really only concerned about docblocks and the functions they precede.

Luckily, it occurred to me that JavaScript is pretty close in syntax to PHP, and I tend to write them with exactly the same style. A little research showed that phpdoc can more or less handle JavaScript code, with a few caveats:

  • The code needs to parse as PHP, so things like inline functions mess it up.
  • The default phpDocumentor ini file doesn’t recognize files with .js extensions.
  • The files need to have a <?php at the top.

Noting this, I wrote a small script that iterates through a directory of .js files, parses them line by line, pulls out only the function declarations (which, hopefully, don’t also have code on the same line) and docblocks, and writes the output (with a <?php at the top) to a same-named file in a different directory.

The script obviously requires phpdoc to be installed, and also requires you to edit the phpDocumentor.ini file (installed with PEAR on my system at /usr/share/php5/PEAR/data/PhpDocumentor/phpDocumentor.ini) and add a “js” line to the [_phpDocumentor_phpfile_exts] section to get phpdoc to recognize *.js files.

I was easily able to integrate this with a Makefile rule and create a single set of cross-linked phpdoc API docs including both JS and PHP files. I also added explicit package names (like “-PHP” and “-JS”) to keep things separated a little.

The script can be found at: http://svn.jasonantman.com/misc-scripts/js2phpdoc.php. It’s (obviously) free for any use, provided that you follow the license terms (leave copyrights intact, send modifications back to me, and update the changelog if you modify it).
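
For readers who just want the idea, the extraction step can be sketched in a few lines. This is not the real js2phpdoc.php; js_to_phpdoc_stub() is a hypothetical function, and prefixing parameters with $ so the stub is valid PHP is an assumption of this sketch that the actual script may handle differently.

```php
<?php
// Rough sketch of the extraction idea: keep docblocks and top-level
// "function name(args)" declarations, drop every other line.
function js_to_phpdoc_stub($jsSource)
{
    $out = array('<?php');
    $inDocblock = false;
    foreach (preg_split('/\r\n|\r|\n/', $jsSource) as $line) {
        $trimmed = trim($line);
        if (!$inDocblock && strpos($trimmed, '/**') === 0) {
            $inDocblock = true;
        }
        if ($inDocblock) {
            $out[] = $line; // copy docblock lines through unchanged
            if (strpos($trimmed, '*/') !== false) {
                $inDocblock = false;
            }
            continue;
        }
        if (preg_match('/^function\s+([A-Za-z_]\w*)\s*\(([^)]*)\)/', $trimmed, $m)) {
            $params = array();
            foreach (explode(',', $m[2]) as $arg) {
                $arg = trim($arg);
                if ($arg !== '') {
                    $params[] = '$' . $arg; // make the parameter look like PHP
                }
            }
            // The function body is replaced with an empty block.
            $out[] = 'function ' . $m[1] . '(' . implode(', ', $params) . ') {}';
        }
    }
    return implode("\n", $out) . "\n";
}
```

Anonymous or inline functions are simply skipped here, which matches the caveat that only plain procedural declarations survive the conversion.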

My Makefile rule (which uses a temp directory to both keep the generated files separate from the source and keep the file paths as seen by phpdoc the same as the actual source):

.PHONY: docs

docs:
	mkdir -p temp/js
	bin/js2phpdoc.php js/ temp/js/
	cp -r inc temp/
	cp *.php temp/
	phpdoc -c docs/default.ini
	rm -Rf temp

PHP Script to Dump Firefox Session

If you’re anything like me, you often find yourself working on multiple computers. Today I left a few tabs open in Firefox on my work laptop, and wanted to continue reading from my desktop. Normally I’d just grab the laptop, or RDP into it if it was my work desktop that had the open tabs, but at the moment my girlfriend is neck-deep in WoW on the MacBook. Having had this problem before (getting tabs back remotely, not a laptop occupied with WoW), I started thinking about a solution.

I could have closed my local Firefox session, moved the sessionstore.js somewhere else, copied the one from the laptop over, re-opened Firefox, … well, you get the idea.

But that sounds like a really sub-optimal solution. So I started looking around a bit. It seems that sessionstore.js is almost JSON, but as per Mozilla bug 407110, it’s not quite standards-compliant. Luckily, it seems that PHP’s JSON module is quite tolerant, so once I stripped off the leading and trailing parens from the file contents, it parsed quite nicely.

I’ve written a small dumpFirefoxSession.php script that reads the sessionstore.js file (in cwd or a specified path), decodes the JSON into an array, and then dumps the tabs. It dumps as either plain text or HTML (currently just elements inside the body, not a full HTML file). The HTML includes an ordered list for each window listing its tabs, with links to the current content (sessionstore.js also holds history for each tab, but I don’t need this), and it shows which tab is currently selected.

You can grab the script from subversion at: http://svn.jasonantman.com/misc-scripts/dumpFirefoxSession.php. The current version is 3. You’ll need PHP (probably 5) with JSON support.
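
The core of the approach fits in one function. This is a sketch, not the script from subversion: parse_session() is a hypothetical name, and it assumes the file is valid JSON once one pair of wrapping parentheses is removed, laid out as windows → tabs → entries, with a 1-based "index" per tab and a 1-based "selected" per window.

```php
<?php
// Sketch under the assumptions stated above; returns one row per open tab.
function parse_session($raw)
{
    $raw = trim($raw);
    // Strip the leading "(" and trailing ")" that keep the file from being plain JSON.
    if (substr($raw, 0, 1) === '(' && substr($raw, -1) === ')') {
        $raw = substr($raw, 1, -1);
    }
    $data = json_decode($raw, true);
    if (!is_array($data) || !isset($data['windows'])) {
        return array();
    }
    $result = array();
    foreach ($data['windows'] as $w => $window) {
        $selected = isset($window['selected']) ? (int) $window['selected'] : 0;
        $tabs = isset($window['tabs']) ? $window['tabs'] : array();
        foreach ($tabs as $t => $tab) {
            if (empty($tab['entries'])) {
                continue; // tab with no history, nothing to show
            }
            // "index" points at the history entry currently displayed in the tab.
            $idx = isset($tab['index']) ? (int) $tab['index'] - 1 : -1;
            if (!isset($tab['entries'][$idx])) {
                $idx = count($tab['entries']) - 1; // fall back to the newest entry
            }
            $entry = $tab['entries'][$idx];
            $result[] = array(
                'window'   => $w + 1,
                'url'      => isset($entry['url']) ? $entry['url'] : '',
                'title'    => isset($entry['title']) ? $entry['title'] : '',
                'selected' => ($t + 1 === $selected),
            );
        }
    }
    return $result;
}
```

Feeding it file_get_contents('sessionstore.js') and looping over the result is enough for a plain-text dump; remember to pass titles and URLs through htmlspecialchars() if you emit HTML.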

Apache catchall vhost

As mentioned in one of my recent posts, I occasionally have to set up catchall pages in Apache. The general idea is usually that I either want a vhost that serves one page for any conceivable request, or that I moved something and want to alert the visitor, but provide a formula-based link to the new content. Assuming you have mod_rewrite, this is relatively simple.

In your vhost configuration (or .htaccess), you just need three lines:

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^(.*)$ /index.php$1 [L]

This internally rewrites every request for the vhost to /index.php (the browser is not redirected at this point). Within your PHP script, you can access the actual request URI through $_SERVER["REQUEST_URI"]. The script that I’m currently using for an internal page is:

$newServer = "http://foo.example.com:12345";
 
if($_SERVER["REQUEST_URI"] == "/" || $_SERVER["REQUEST_URI"] == "/index.php")
  {
    header("Location: ".$newServer);
  }
else
  {
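    // REQUEST_URI is supplied by the client, so escape it before echoing it into HTML
    $newURL = htmlspecialchars($newServer.$_SERVER["REQUEST_URI"], ENT_QUOTES);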
    echo '<html><head><title>Page Moved</title>';
    echo '<META HTTP-EQUIV="refresh" CONTENT="5;URL='.$newURL.'">';
    echo '</head><body>';
    echo '<p>The page you are looking for is best found at:</p>';
    echo '<p><strong><a href="'.$newURL.'">'.$newURL.'</a></strong></p>';
    echo '<p>You will be automatically redirected after 5 seconds. If this does not happen, click the link above.</p>';
    echo '</body></html>';
  }

This script takes two distinct actions:

  • If the requested URL is / or /index.php, it transparently redirects to a different URL (and port).
  • Otherwise, it displays a “page moved to” message and uses a Meta-Refresh to redirect after 5 seconds.
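
The two cases can also be written as a small function that is easy to test without a web server. This is a sketch with hypothetical names (moved_location() and html_attr() are not part of the script above):

```php
<?php
// Hypothetical helper: decide what to do with a request URI on the old vhost.
function moved_location($newServer, $requestUri)
{
    if ($requestUri === '/' || $requestUri === '/index.php') {
        // Front page: send a plain HTTP redirect to the new server.
        return array('mode' => 'redirect', 'url' => $newServer);
    }
    // Anything else: show the "page moved" notice pointing at the same path.
    return array('mode' => 'notice', 'url' => $newServer . $requestUri);
}

// The request URI comes from the client, so escape it before printing it
// into an attribute or link text.
function html_attr($value)
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}
```

The page itself would then call header("Location: ...") for the 'redirect' mode and echo html_attr($result['url']) inside the meta refresh and link for the 'notice' mode.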