Tuesday 6 January 2009

Twitter Sources: What Apps are being used?

Some of you may have seen this neat twitter mashup over on funkatron which provides some nice stats of what Twitter Apps are being used by sampling the public timeline.

Funkatron says...

"This data is derived from 20-message slices taken every minute from the Twitter public timeline. It currently reports the overall usage since the data collection began (2008-07-31), as well as a few common time periods."

I thought it would be fun to replicate this with Tweet-SQL. The Funkatron page does allow you to drill down through various date ranges but I've just done a total count for the sake of simplicity. A few modifications to the script and you could also break by date if needed. Please be aware I've used a couple of features new to SQL Server 2008 so if you're using 2005 you need to modify it to work (script is commented in the appropriate places).

This script will call the public timeline every one minute for the number of iterations specified. In the example below @iterations is set to 100 so this script will take over 100 minutes to complete!


-- Create the TwitterSources table if it doesn't already exist
IF (OBJECT_ID('TwitterSources', 'U')) IS NULL
BEGIN
CREATE TABLE TwitterSources
(
[source] NVARCHAR(100) PRIMARY KEY NOT NULL,
[count] INT NOT NULL DEFAULT 0
);
END

-- Using 2008 variable initialization here, change if using 2005
DECLARE @xml XML,
@handle INT,
@iterations INT = 100,
@count INT = 0;

-- Turn of resultsets from Tweet-SQL
EXEC dbo.tweet_cfg_resultset_send 0;

-- Loops reading the public timeline for the specified number of iterations
WHILE (@count < @iterations)
BEGIN

-- Get the pubic timeline
EXEC dbo.tweet_sts_public_timeline @xml OUTPUT;

-- Prepare an xml document
EXEC sp_xml_preparedocument @handle OUTPUT, @xml;

SELECT source, COUNT(*) AS [count]
INTO #PublicTimelineSource
FROM OPENXML (@handle, '/statuses/status', 2)
WITH
(
source NVARCHAR(100)
)
GROUP BY source;

-- Add this timelines source count to the TwitterSources
-- This is also a 2008 feature, rewrite if using 2005!
MERGE dbo.TwitterSources AS [target]
USING #PublicTimelineSource AS temp
ON [target].source = temp.source
WHEN MATCHED THEN
UPDATE SET [target].[count] = [target].[count] + temp.[count]
WHEN NOT MATCHED THEN
INSERT ([source], [count])
VALUES (temp.source, temp.[count]);

-- destroy the xml document
EXEC sp_xml_removedocument @handle;

-- Drop the tempory table
DROP TABLE #PublicTimelineSource;

-- Wait for a one minute
-- The public_timeline is cached for 1 minute so no pointed less than this
WAITFOR DELAY '00:01:00';

-- Increment count
SET @count = @count + 1;
PRINT 'Completed iteration ' + CAST(@count AS VARCHAR(5));

END

-- Turn resultsets from Tweet-SQL back on as appropriate
EXEC dbo.tweet_cfg_resultset_send 1;

The TwitterSources table will be populated with a count of all the Twitter apps that were encountered from the public timeline sampling. Here are the top ten rows from one of my test runs...

web - 959

txt - 174

twitterfeed - 156

twitterrific - 142

TweetDeck - 137

TwitterFox - 84

twhirl - 81

TwitterBerry - 54

Twinkle - 30

Tweetie - 30

No comments: