<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>some thoughts &#187; stats</title>
	<atom:link href="http://programmerthoughts.com/tags/stats/feed/" rel="self" type="application/rss+xml" />
	<link>http://programmerthoughts.com</link>
	<description></description>
	<lastBuildDate>Sat, 28 Aug 2010 23:27:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Cloud Files CDN Stats</title>
		<link>http://programmerthoughts.com/programming/cloud-files-cdn-stats/</link>
		<comments>http://programmerthoughts.com/programming/cloud-files-cdn-stats/#comments</comments>
		<pubDate>Tue, 09 Feb 2010 22:57:14 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[Cloud Files]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[cloud files]]></category>
		<category><![CDATA[logs]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[stats]]></category>

		<guid isPermaLink="false">http://programmerthoughts.com/?p=385</guid>
		<description><![CDATA[I wrote a small Python script that loads Cloud Files CDN log files and aggregates the data. The code is available in <a href="http://github.com/notmyname/python_scripts/tree/master/cf_stats/">my GitHub account</a>.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.rackspacecloud.com/cloud_hosting_products/files">Cloud Files</a> serves public content through Limelight&#8217;s CDN. On public containers, one can opt in to save the logs for all content requested from the CDN. These logs record raw usage in an Apache log format and are stored compressed in a container named &#8220;.CDN_ACCESS_LOGS&#8221;. One can then parse these logs with any commercial analytics tool or use a custom solution. Being a developer, I wrote a small Python script that loads these log files and aggregates the data.</p>
<p>The code can be found in <a href="http://github.com/notmyname/python_scripts/tree/master/cf_stats/">my GitHub repository</a>.</p>
<p>After updating the code with your own Cloud Files credentials (or using your own cf_auth module), usage is similar to the following:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="shell" style="font-family:monospace;">$ ./cf_stats.py obj_name</pre></td></tr></table></div>

<p>&#8220;obj_name&#8221; is one of the keys the stats can be grouped on. Others include &#8220;date&#8221;, &#8220;container_name&#8221;, and &#8220;user_agent&#8221;. The default is &#8220;obj_name&#8221;, and any unrecognized parameter prints a usage message.</p>
<p>Sample output:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="shell" style="font-family:monospace;">Object Name: my_file.pdf
Count: 11
User Agents: &quot;Yandex/1.01.001 (compatible; Win16; I)&quot;
Response: 200 304
Referrers: -
IPs: 1.2.3.4 1.2.3.5 1.2.3.6
Dates: 24/Jan/2010 25/Jan/2010 31/Jan/2010 01/Jan/2010 30/Dec/2009
Container Name: some_container</pre></td></tr></table></div>
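
<p>As a rough illustration of the grouping behind output like the sample above, here is a minimal sketch. It is not the code from cf_stats.py: the record dictionaries, the <code>group_records</code> function, and the &#8220;status&#8221; and &#8220;ip&#8221; field names are all assumptions made for this example. Only the grouping keys come from the script&#8217;s usage.</p>

```python
from collections import defaultdict

# Hypothetical, hand-written records. The keys "obj_name", "container_name",
# "date" and "user_agent" mirror the grouping keys named in the post;
# "status" and "ip" are assumed names for this sketch.
records = [
    {"obj_name": "my_file.pdf", "container_name": "some_container",
     "date": "24/Jan/2010", "user_agent": "Yandex/1.01.001",
     "status": "200", "ip": "1.2.3.4"},
    {"obj_name": "my_file.pdf", "container_name": "some_container",
     "date": "25/Jan/2010", "user_agent": "Yandex/1.01.001",
     "status": "304", "ip": "1.2.3.5"},
    {"obj_name": "other.txt", "container_name": "some_container",
     "date": "25/Jan/2010", "user_agent": "Yandex/1.01.001",
     "status": "200", "ip": "1.2.3.4"},
]


def group_records(records, key="obj_name"):
    """Group records by `key`: count the hits in each group and collect
    the distinct values seen for every other field."""
    groups = defaultdict(lambda: {"count": 0, "fields": defaultdict(set)})
    for record in records:
        group = groups[record[key]]
        group["count"] += 1
        for field, value in record.items():
            if field != key:
                group["fields"][field].add(value)
    return groups


stats = group_records(records, "obj_name")
for name, group in sorted(stats.items()):
    print(name, "Count:", group["count"])
    for field, values in sorted(group["fields"].items()):
        print("  " + field + ":", " ".join(sorted(values)))
```

<p>Passing a different key, such as <code>group_records(records, "date")</code>, regroups the same records without touching the parsing step.</p>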

<p>Any of the given fields can be used as a group. Even if the script&#8217;s output is not to your liking as-is, its parsing and grouping functions may be a good starting point for writing your own log parser.</p>
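<p>If you do write your own parser, the first step might look something like the sketch below. It assumes the logs follow the common Apache &#8220;combined&#8221; layout and are gzip-compressed; the exact layout of the Cloud Files CDN logs may differ, so treat the pattern as a starting point. The names <code>LOG_PATTERN</code>, <code>FIELDS</code>, <code>parse_line</code>, and <code>parse_gzipped_log</code> are made up for this example and are not taken from cf_stats.py.</p>

```python
import gzip
import io
import re

# Assumed layout: Apache "combined" log format. The real CDN logs may use a
# different field order, so adjust the pattern to match your own files.
LOG_PATTERN = re.compile(
    r'(\S+) \S+ \S+ \[([^:\]]+):[^\]]*\] '
    r'"(\S+) (\S+)[^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"'
)
# Hypothetical field names, one per capture group in LOG_PATTERN.
FIELDS = ("ip", "date", "method", "path", "status", "size",
          "referrer", "user_agent")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))


def parse_gzipped_log(raw_bytes):
    """Decompress one gzip-compressed log object held in memory and parse
    every line that matches the assumed layout."""
    buffer = io.BytesIO(raw_bytes)
    with gzip.open(buffer, "rt", encoding="utf-8", errors="replace") as handle:
        parsed = (parse_line(line) for line in handle)
        return [record for record in parsed if record is not None]


# A made-up log line, compressed in memory to stand in for a downloaded log.
sample = (
    '1.2.3.4 - - [24/Jan/2010:10:00:00 +0000] '
    '"GET /some_container/my_file.pdf HTTP/1.1" 200 1234 "-" '
    '"Yandex/1.01.001 (compatible; Win16; I)"\n'
)
parsed_records = parse_gzipped_log(gzip.compress(sample.encode("utf-8")))
print(parsed_records[0]["date"], parsed_records[0]["status"],
      parsed_records[0]["path"])
```

<p>How a request path maps to a container name and object name depends on the actual log layout, so that step is left out here.</p>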
]]></content:encoded>
			<wfw:commentRss>http://programmerthoughts.com/programming/cloud-files-cdn-stats/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
