Friday, 19 December 2008

twtki: a wiki built by tweets

One interesting thing about the Twitter API is the vast number of applications where it has been put to use. Yesterday I came across twtki which is literally a wiki built by the tweets sent to @twtki. Nice!

Here's how Tweet-SQL could be used to pull down the @replies from @twtki and fed into a database.

First lets create a table for holding these...

CREATE TABLE twtki

(

      created_at DATETIME,

      id INT PRIMARY KEY,

      [text] NVARCHAR(240),

      source NVARCHAR(100),

      truncated NVARCHAR(5),

      in_reply_to_status_id INT,

      in_reply_to_user_id INT,

      favorited NVARCHAR(5),

      status_Id INT,

      statuses_Id INT,

      definer NVARCHAR(240),

      topic NVARCHAR(240),

      definition NVARCHAR(240)     

);

Sadly there seems to be various issues with the since_id parameter. The since_id parameter does not function the way you would expect, i.e. if since_id = 100 then you would expect it to return 101-120 but it actually returns the 20 most recently sent @replies, i.e. 500-519. Combining this with the count parameter as mentioned in the google group doesn't seem to help but then this isn't listed in the API docs for the replies method.

This only seems to be an issue if you're bothered about archiving all of your @replies. This script picks up new @replies without any evident issue but should a large number of @replies be received between checks some may be missed. It would be relatvely easy to switch to using the page parameter and then cross reference status_ids if you think this will be an issue for you.

Problems aside here a script that pull down @replies, stores them in a table and attempts to format the status text into definer, topic and definition as stated on twtki.

-- Setup variables

DECLARE @since_id INT,

            @xml XML,

            @handle INT,

            @optional NVARCHAR(50),

            @rowcount TINYINT;

 

-- Get the last id to run from

SET @since_id = (SELECT ISNULL(MAX(id), 1) FROM twtki);

-- Setup the @optional paramter with since_id

SET @optional = '?since_id=' + CAST(@since_id AS NVARCHAR(10));

-- Set @rowcount so we pass hrough the loop at least once

SET @rowcount = 20;

 

-- Turn of resultsets from Tweet-SQL

EXEC dbo.tweet_cfg_resultset_send 0;

-- Get 20 @replies

EXEC dbo.tweet_sts_replies @optional, @xml OUTPUT;

 

WHILE (@rowcount = 20) -- While we still have results to deal with

BEGIN

      -- Prepare an xml document

      EXEC sp_xml_preparedocument @handle OUTPUT, @xml;

 

      -- Insert the @replies into our twtki table

      INSERT INTO dbo.twtki

      (

            created_at,

            id,

            [text],

            source,

            truncated,

            in_reply_to_status_id,

            in_reply_to_user_id,

            favorited,

            status_Id,

            statuses_Id,

            definer,

            topic,

            definition

      )

      SELECT  dbo.tweet_fnc_dateconvert(created_at),

                  id,

                  [text],

                  source,

                  truncated,

                  in_reply_to_status_id,

                  in_reply_to_user_id,

                  favorited,

                  status_Id,

                  statuses_Id,

                  SUBSTRING([text], 2, CHARINDEX(' ', [text]) -1) AS Definer,

                  CASE

                        WHEN [text] LIKE '%: %' THEN

                              SUBSTRING([text], CHARINDEX(' ', [text]), CHARINDEX(': ', [text]) - CHARINDEX(' ', [text]))

                        ELSE

                              'Malformed'

                  END AS Topic,

                  CASE

                        WHEN [text] LIKE '%: %' THEN

                              SUBSTRING([text], CHARINDEX(': ', [text]) + 1, 240)

                        ELSE

                              'Malformed'

                  END AS Definition

      FROM OPENXML (@handle, '/statuses/status', 2)

      WITH

      (

            created_at NVARCHAR(30),

            id INT,

            [text] NVARCHAR(140),

            source NVARCHAR(100),

            truncated NVARCHAR(5),

            in_reply_to_status_id INT,

            in_reply_to_user_id INT,

            favorited NVARCHAR(5),

            status_Id INT,

            statuses_Id INT

      );

 

      -- Get the rowcount, when < 20 then we have the last page of @replies for now

      SET @rowcount = @@ROWCOUNT;

      -- Which page have we done?

      PRINT 'Retrieved ' + CAST(@rowcount AS NVARCHAR(2)) + ' @replies.';

      -- destroy the xml document

      EXEC sp_xml_removedocument @handle;

      -- Update the since_id value.

      SET @since_id = (SELECT MAX(id) FROM twtki);

      -- Setup the @optional paramter with since_id

      SET @optional = '?since_id=' + CAST(@since_id AS NVARCHAR(10));

 

      -- Wait for a bit adjust as appropriate

      WAITFOR DELAY '00:00:05';

      -- Get the next batch of @replies

      EXEC dbo.tweet_sts_replies @optional, @xml OUTPUT;

 

END

 

-- Turn resultsets from Tweet-SQL back on as appropriate

EXEC dbo.tweet_cfg_resultset_send 1;



No comments: