Wednesday, 31 December 2008

Tweet-SQL Released!

The first version of Tweet-SQL has been released just before 2009! Have a look at the features Tweet-SQL boasts, check out some screenshots, and download Tweet-SQL!

I look forward to improving Tweet-SQL even more over 2009.

Happy New Year!
Blogged with the Flock Browser

Friday, 19 December 2008

twtki: a wiki built by tweets

One interesting thing about the Twitter API is the vast number of applications where it has been put to use. Yesterday I came across twtki which is literally a wiki built by the tweets sent to @twtki. Nice!

Here's how Tweet-SQL could be used to pull down the @replies from @twtki and fed into a database.

First lets create a table for holding these...

CREATE TABLE twtki

(

      created_at DATETIME,

      id INT PRIMARY KEY,

      [text] NVARCHAR(240),

      source NVARCHAR(100),

      truncated NVARCHAR(5),

      in_reply_to_status_id INT,

      in_reply_to_user_id INT,

      favorited NVARCHAR(5),

      status_Id INT,

      statuses_Id INT,

      definer NVARCHAR(240),

      topic NVARCHAR(240),

      definition NVARCHAR(240)     

);

Sadly there seems to be various issues with the since_id parameter. The since_id parameter does not function the way you would expect, i.e. if since_id = 100 then you would expect it to return 101-120 but it actually returns the 20 most recently sent @replies, i.e. 500-519. Combining this with the count parameter as mentioned in the google group doesn't seem to help but then this isn't listed in the API docs for the replies method.

This only seems to be an issue if you're bothered about archiving all of your @replies. This script picks up new @replies without any evident issue but should a large number of @replies be received between checks some may be missed. It would be relatvely easy to switch to using the page parameter and then cross reference status_ids if you think this will be an issue for you.

Problems aside here a script that pull down @replies, stores them in a table and attempts to format the status text into definer, topic and definition as stated on twtki.

-- Setup variables

DECLARE @since_id INT,

            @xml XML,

            @handle INT,

            @optional NVARCHAR(50),

            @rowcount TINYINT;

 

-- Get the last id to run from

SET @since_id = (SELECT ISNULL(MAX(id), 1) FROM twtki);

-- Setup the @optional paramter with since_id

SET @optional = '?since_id=' + CAST(@since_id AS NVARCHAR(10));

-- Set @rowcount so we pass hrough the loop at least once

SET @rowcount = 20;

 

-- Turn of resultsets from Tweet-SQL

EXEC dbo.tweet_cfg_resultset_send 0;

-- Get 20 @replies

EXEC dbo.tweet_sts_replies @optional, @xml OUTPUT;

 

WHILE (@rowcount = 20) -- While we still have results to deal with

BEGIN

      -- Prepare an xml document

      EXEC sp_xml_preparedocument @handle OUTPUT, @xml;

 

      -- Insert the @replies into our twtki table

      INSERT INTO dbo.twtki

      (

            created_at,

            id,

            [text],

            source,

            truncated,

            in_reply_to_status_id,

            in_reply_to_user_id,

            favorited,

            status_Id,

            statuses_Id,

            definer,

            topic,

            definition

      )

      SELECT  dbo.tweet_fnc_dateconvert(created_at),

                  id,

                  [text],

                  source,

                  truncated,

                  in_reply_to_status_id,

                  in_reply_to_user_id,

                  favorited,

                  status_Id,

                  statuses_Id,

                  SUBSTRING([text], 2, CHARINDEX(' ', [text]) -1) AS Definer,

                  CASE

                        WHEN [text] LIKE '%: %' THEN

                              SUBSTRING([text], CHARINDEX(' ', [text]), CHARINDEX(': ', [text]) - CHARINDEX(' ', [text]))

                        ELSE

                              'Malformed'

                  END AS Topic,

                  CASE

                        WHEN [text] LIKE '%: %' THEN

                              SUBSTRING([text], CHARINDEX(': ', [text]) + 1, 240)

                        ELSE

                              'Malformed'

                  END AS Definition

      FROM OPENXML (@handle, '/statuses/status', 2)

      WITH

      (

            created_at NVARCHAR(30),

            id INT,

            [text] NVARCHAR(140),

            source NVARCHAR(100),

            truncated NVARCHAR(5),

            in_reply_to_status_id INT,

            in_reply_to_user_id INT,

            favorited NVARCHAR(5),

            status_Id INT,

            statuses_Id INT

      );

 

      -- Get the rowcount, when < 20 then we have the last page of @replies for now

      SET @rowcount = @@ROWCOUNT;

      -- Which page have we done?

      PRINT 'Retrieved ' + CAST(@rowcount AS NVARCHAR(2)) + ' @replies.';

      -- destroy the xml document

      EXEC sp_xml_removedocument @handle;

      -- Update the since_id value.

      SET @since_id = (SELECT MAX(id) FROM twtki);

      -- Setup the @optional paramter with since_id

      SET @optional = '?since_id=' + CAST(@since_id AS NVARCHAR(10));

 

      -- Wait for a bit adjust as appropriate

      WAITFOR DELAY '00:00:05';

      -- Get the next batch of @replies

      EXEC dbo.tweet_sts_replies @optional, @xml OUTPUT;

 

END

 

-- Turn resultsets from Tweet-SQL back on as appropriate

EXEC dbo.tweet_cfg_resultset_send 1;



Thursday, 18 December 2008

Archiving another Twitter user's timeline with Tweet-SQL

Here's a modified version of the script I created to archive your twitter timeline with Tweet-SQL. This script will allow you to archive another users timeline provided they haven't protected their updates. This script will be good for archiving 2000 statuses before you hit your api limit. In a future script I'll show how you can archive even larger timeline using the since_id and count in the @optional parameter.

First create a table to hold the users timeline...

CREATE TABLE TwitterArchive

(

      created_at DATETIME,

      id INT PRIMARY KEY,

      [text] NVARCHAR(140),

      source NVARCHAR(100),

      truncated NVARCHAR(5),

      in_reply_to_status_id INT,

      in_reply_to_user_id INT,

      favorited NVARCHAR(5),

      status_Id INT,

      statuses_Id INT                                                              

);



Just modify the value for @user_id_or_nick and you're good to go...


DECLARE @xml XML,

            @handle INT,

            @page INT,

            @optional NVARCHAR(10),

            @rowcount TINYINT,

            @user_id_or_nick NVARCHAR(30);


-- Set the users timeline to archive

SET @user_id_or_nick = 'lost_in_bangkok';

-- Starting page

SET @page = 1;

-- Turn off resultsets from Tweet-SQL

EXEC dbo.tweet_cfg_resultset_send 0;

-- Set the optional parameter to request page 1

SET @optional = '?page=' + CAST(@page AS NVARCHAR(4));

-- Get page 1 of your timeline

EXEC dbo.tweet_sts_user_timeline @user_id_or_nick, @optional, @xml OUTPUT;

SET @rowcount = 20;

WHILE (@rowcount = 20) -- While we still have results to deal with

BEGIN

      -- Prepare an xml document

      EXEC sp_xml_preparedocument @handle OUTPUT, @xml;

      -- Insert the page into a table

      INSERT INTO dbo.TwitterArchive

      (

            created_at,

            id,

            [text],

            source,

            truncated,

            in_reply_to_status_id,

            in_reply_to_user_id,

            favorited,

            status_Id,

            statuses_Id

      )

      SELECT  dbo.tweet_fnc_dateconvert(created_at),

                  id,

                  [text],

                  source,

                  truncated,

                  in_reply_to_status_id,

                  in_reply_to_user_id,

                  favorited,

                  status_Id,

                  statuses_Id

      FROM OPENXML(@handle, '/statuses/status', 2)

      WITH

      (

            created_at NVARCHAR(30),

            id INT,

            [text] NVARCHAR(140),

            source NVARCHAR(100),

            truncated NVARCHAR(5),

            in_reply_to_status_id INT,

            in_reply_to_user_id INT,

            favorited NVARCHAR(5),

            status_Id INT,

            statuses_Id INT

      );

      -- Get the rowcount, when < 20 then we have the last page

      SET @rowcount = @@ROWCOUNT;

      -- Which page have we done?

      PRINT 'Archived page ' + CAST(@page AS NVARCHAR(4));

      -- destroy the xml document

      EXEC sp_xml_removedocument @handle;

      -- Increment the page count

      SET @page = @page + 1;

      -- Setup the optional parameter

      SET @optional = '?page=' + CAST(@page AS NVARCHAR(4));

      -- Wait for a bit...

      WAITFOR DELAY '00:00:05';

      -- Get the next page

      EXEC dbo.tweet_sts_user_timeline @user_id_or_nick, @optional, @xml OUTPUT;

END

-- Turn resultsets from Tweet-SQL back on as appropriate

EXEC dbo.tweet_cfg_resultset_send 1;