If you are a WordPress developer looking for an interesting new project, or if you have an existing WordPress statistics plugin and are looking for ways to add more to it, here is the kind of plugin I would love to have available to me….
Last night I wanted to create a graph of the site-wide quarterly trend in the number of blog posts published across ALL blogs on our blogs.voxeo.com corporate blog portal. After looking through a great number of potential WordPress plugins (with special thanks to the always awesome Andrea_R for her pointers), I got rather frustrated because most of them seem to focus on either site-wide visitor stats, which I’m already tracking through Google Analytics, or provided some “activity” stats – but only for a single blog. (And if I missed one that does what I outline below, I’d love to hear about it.)
I did finally get the kind of chart that I wanted… but through a kludgey use of SQL and Excel that I’ll describe below. Here was my end goal:
My purpose in this was somewhat simple – I wanted to see and chart the growth of our content creation efforts on the blog server. I want to show trends in the amount of content we are publishing… and then potentially tie those in to other trends such as increased visits (which I track through Google Analytics) and increased sales inquiries (tracked through Google Analytics and SalesForce.com).
I have other reasons, too. We have set up a number of different blogs on different topics. I would like to be able to see which of the blogs we are writing in more often than others… and which might need some attention – or potentially archiving. I’d like a chart like the one above for each individual blog over time… or perhaps with the ability to graph multiple blogs on the same chart as a comparison. You could see this being interesting data in a larger organization with different teams creating different blogs… to see which teams are creating the most online content.
To that point, it would be interesting to see which users are generating the most content… conceivably across all blogs. In part from a individual performance point-of-view, and in part from a site-wide “health” point-of-view. Our blog portal used to only have posts written by me. Now there are a good number of writers and to me that is the sign of a healthy content creation environment. As the one responsible for the overall site, I’d like to be able to see how the contributions are across the site.
I’d also like to be able to see the trend in comments. How many comments? How many track/pingbacks?
I’d like this data available in a quarterly basis, as I’ve shown here, and perhaps also in a monthly or weekly basis. Maybe even yearly if the site has been around for a bit.
THE IDEAL PLUGIN
In my mind, the ideal plugin would be something that is a panel in my WordPress admin interface where I go to generate these reports. For performance, I wouldn’t want this hitting the database and running all the time – but just there to generate on-demand reports. Some of the ideas I would like to see:
- Allow me to set a date range
- Choose from weekly, monthly, quarterly and yearly (perhaps daily needs to be in there, too)
- Allow me to choose which of the blogs I want to get trends on (some of our blogs are experimental or, like our Events blog, have pages only and no posts)
- Choose whether to get trends on only public blogs or also private blogs
- Choose whether to aggregate all data together or to graph blogs separately
- Graph different types of data:
- Count of posts by blog (or in aggregate)
- Blog posts by user
- Comments by blog (or in aggregate)
- Allow some modification of the title, legend, etc.
- Or… alternatively… just create a CSV file with all the data that could be brought into Excel
There’s probably more, like overall word count and stats like that… but that’s a start. If you are reading this, what would you like to add?
To get the chart I wanted, I had to sweep away some of the cobwebs on the SQL knowledge in my brain and play around at the MySQL command line. I logged into the blog server, launched “mysql“, connected to the “wordpress” database and started playing around with “select” statements. There is a table, wp_blogs, that lists all of your blogs and, most importantly, shows the “blog_id” for each blog. For each blog, there is then a table named “wp_<blog_ID>_posts“, as in “wp_4_posts” which contains the info about the posts, dates, authors, etc. I found that I could get the data I wanted by doing this:
select ID, post_title, post_date from wp_4_posts where post_type = 'post' and post_status='publish';
Now, I didn’t include “post_author” in my work last night, but I wish I did because it would have let me gather the author stats I want. Anyway, I continued mucking around until I eventually came up with a SQL statement that gave me a pipe-delimited text file that I could easily import into Excel on my Mac. (I couldn’t use commas because some of my post titles have commas!) Naturally I then had to automate the process so I hacked up a very quick python script to create the relevant SQL statement and then send that to MySQL.
for i in [4,7,8,11,12,13,15,16,19,22,24]: # This is a list of the blog_ids of the blogs I care about
target = "select ID, post_title, post_date from wp_"+str(i)+\ "_posts where post_type = 'post' and post_status='publish' ORDER BY post_date INTO OUTFILE '/tmp/b"+\ str(i)+ ".txt' FIELDS TERMINATED BY '|';"
os.system('mysql wordpress -e "'+target+'"')
Like I said… a total kludge.
The end result was a series of pipe-delimited text files that I could import into Excel, combine together, and do some kludging there to get me the quarterly data which I could then chart.
If I knew how to use the various MySQL modules for python, I’m sure it could have been done much more elegantly. (And yes, I know that the python “os” module is deprecated in newer python versions and “subprocess” should be used instead… I’m old school and this was a quick late-night hack.) Or if my SQL knowledge weren’t so rusty, there’s probably some big SELECT statement that could join this all together and give me a nice unified output, perhaps even with the counts I want.
And if someone out there is handy with SQL or already has a WordPress plugin that does some or all of this… it would be great to see! Given that my usage of WordPress MU (yes, I’m still on 2.9.2, but will be moving to 3.0 soon) is for a corporate blog portal, these are the kinds of statistics and trends that are of interest. I would think that even if someone were running a more open site, these kind of trends would be helpful to know and understand.
Hopefully there’s just some plugin out there that I’ve missed!
What do you think? What would you want in a plugin like this?