Design Decisions and Thought Process

I decided to build a small web app in which you enter a stock symbol and get back some basic information about the stock (price, change, volume), along with a count of how many times the stock was mentioned on Twitter.  I thought it would be interesting to look for a relationship between the number of tweets (and their content) about a stock and the change in the stock's price.  One stock of note was Google.  Its earnings report, which was not positive, was released early at 12:30 p.m. instead of after the market closed.  The stock dropped 8% in value and there was a massive amount of activity on Twitter.

I built the app with Django and Python even though I had virtually no experience with either.  I made this decision because I know Points uses them, and I wanted to prove to Points and to myself that I could learn them in a short amount of time.  It was definitely a lot to take in over such a short period, but I learned a lot.  Hopefully, my lack of experience in Django isn't too apparent...

I get the stock information from a Yahoo API: I request a URL containing the symbol and some character codes that specify which fields I want, and it returns the data in CSV format.  I looked into using Google's Finance API; however, it was shut down very recently, on October 20th!
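As a rough illustration of the request-and-parse step, here is a minimal sketch.  The base URL, the query parameter names ("s" and "f"), the field codes, and the sample response are all placeholders I made up for the example, not the real service's details.

```python
import csv
import io
from urllib.parse import urlencode


def build_quote_url(base_url, symbol, field_codes):
    """Build a request URL carrying the symbol and the field codes."""
    # "s" and "f" are assumed parameter names, used here only for illustration.
    return base_url + "?" + urlencode({"s": symbol, "f": field_codes})


def parse_quote_csv(text, field_names):
    """Map the first CSV row of a response onto the given field names."""
    row = next(csv.reader(io.StringIO(text)))
    return dict(zip(field_names, (cell.strip() for cell in row)))


# Placeholder URL and made-up response values, not real market data.
url = build_quote_url("http://example.com/quotes.csv", "GOOG", "l1c1v")
sample_response = "700.00,-5.25,1000000\r\n"
quote = parse_quote_csv(sample_response, ["price", "change", "volume"])
```

The values come back as strings, so a real view would still need to convert them to numbers before doing any arithmetic.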

I count the number of tweets about the stock using Twitter's search API.  Conveniently, stock symbols are searchable as a 'cashtag', formed by putting the '$' symbol in front of the stock symbol (e.g. $GOOG).  Inconveniently, Twitter only allows a maximum of 100 results per request and only 15 pages of results are searchable, for a total of 1500 tweets.  I loop through the results to find the number of tweets within the last n hours.  The algorithm is as follows:
- loop up to 15 times, once for each page of results
- loop backwards from the end of the current page's results, comparing each tweet's time with the time n hours ago, and stop when the tweet's time is greater than the time n hours ago
Since we receive the tweets ordered by time with the newest first, if the last tweet's time on a page is greater than the time n hours ago, we know we want to count all the results from that page.  If it is less, we loop backwards until we find the first tweet whose time is within our timeframe, count everything up to and including it, and stop.
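The counting steps above can be sketched roughly as follows.  This is not the app's actual code: it assumes each page has already been reduced to a list of datetime objects sorted newest first, and the function name is made up for the example.

```python
from datetime import datetime, timedelta


def count_recent_tweets(pages, cutoff):
    """Count tweets newer than cutoff across pages of timestamps.

    Each page is assumed to be a list of datetimes sorted newest first.
    """
    total = 0
    for page in pages:  # at most 15 pages in the setup described above
        if not page:
            break
        if page[-1] > cutoff:
            # The oldest tweet on this page is still inside the window,
            # so every tweet on the page counts.
            total += len(page)
            continue
        # Walk backwards from the oldest tweet until one is inside the window.
        index = len(page) - 1
        while index >= 0 and page[index] <= cutoff:
            index -= 1
        total += index + 1
        break  # any later page only holds older tweets
    return total
```

If all 15 pages fall inside the window, the function returns the cap (1500 with full pages), which means the true count may be higher than what is reported.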

I considered using a binary search for the inner loop, but I figured it wasn't really worth it: the loop runs a bounded number of times (100 in the worst case), so cutting that down to about lg(100), roughly 7 comparisons, wouldn't be significantly faster.
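For comparison, the binary-search variant I decided against might look something like this.  It is only a sketch under the same assumption as before (a page is a list of datetimes sorted newest first), and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def count_newer_than(page, cutoff):
    """Return how many timestamps in page (sorted newest first) are > cutoff."""
    low, high = 0, len(page)
    while low < high:
        mid = (low + high) // 2
        if page[mid] > cutoff:
            low = mid + 1  # mid is inside the window; the boundary is further right
        else:
            high = mid     # mid is too old; the boundary is at mid or to its left
    return low
```

On a 100-tweet page this needs at most 7 comparisons instead of up to 100, which is a saving too small to notice next to the time spent on the network request.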

The user can also click on the stock's row in the table to view some recent tweets.  I removed the retweets (tweets beginning with "RT") because, if I simply displayed the 10 most recent tweets, they could all be identical retweets, which wouldn't be interesting.
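A minimal sketch of that filter, assuming the tweets have already been reduced to a list of text strings (the function name is made up for the example):

```python
def recent_original_tweets(tweet_texts, limit=10):
    """Return up to `limit` tweets, skipping those that begin with "RT"."""
    originals = [text for text in tweet_texts if not text.startswith("RT")]
    return originals[:limit]
```

A plain prefix check like this would also drop an ordinary tweet that happens to start with the letters "RT", which seemed like an acceptable trade-off for a simple display.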

I used Twitter's Bootstrap framework for the look and feel of the website.  I am not a great designer, but by using their framework I was able to get a clean and nice-looking appearance.
