
**Doesn't need a <code>read.csv</code> side function: there is no room for failure, so there is nothing to test.
**Make ***one query*** per iteration, please.
 
===7/19: Application on Todd's Hub Project Pt.II===
*As documented in the <code>python-twitter</code> documentation, there is no direct way to filter timeline query results by start date/end date. So I've decided to write a support module, <code>time_signature_processor</code>, to help count the number of tweets posted within the past month.
**first-take with <code>from datetime import datetime</code>
**Use <code>datetime.datetime.strptime()</code> to parse the (luckily) formatted date strings provided by <code>twitter.Status</code> objects into <code>datetime.datetime</code> objects, which support direct comparison (e.g. <code>if tweet_time_obj < one_month_ago_obj:</code>).
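**Does not support timezone-aware counting yet. The current Python version (2.7) ships no concrete timezone class in its standard library, and its <code>strptime()</code> does not accept <code>%z</code>, so my <code>datetime.datetime</code> objects are timezone-naive.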
***'''functionality to be subsequently improved'''
*To retrieve the following count for each shortname, it seems I have to call <code>twitter.api.GetUser()</code> in addition to <code>twitter.api.GetTimeline</code>. To ration token usage, I will omit this second call for now.
**'''functionality to be subsequently improved'''
*Improvements to debugging interface and practice
**Note Komodo IDE's <code>Unexpected Indent</code> error message, which appears when a block's indentation mixes tabs and spaces. Use the editor debugger instead of the interactive shell in this case; fixing it in the shell is tedious, if not impossible.
*A <code>pandas.DataFrame</code> can be built in one step from a dictionary that maps each column name to a list of values; each list position becomes a row of the df proper. This is more efficient than my past method of creating an empty table and then populating it cell by cell.
 import pandas as pd
 # Each key becomes a column name; each list supplies that column's values, row by row.
 raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
             'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
             'age': [42, 52, 36, 24, 73],
             'preTestScore': [4, 24, 31, 2, 3],
             'postTestScore': [25, 94, 57, 62, 70]}
 # Passing columns= fixes the column order explicitly.
 df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'age', 'preTestScore', 'postTestScore'])
 df
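
The tweet-counting step from the 7/19 notes above could look roughly like the sketch below. It is not the actual <code>time_signature_processor</code> module: the function names <code>parse_created_at</code> and <code>count_recent</code> are hypothetical, and the timestamp layout (e.g. <code>Wed Aug 27 13:08:45 +0000 2008</code>, always in UTC) is an assumption about what the status objects provide. Because the <code>+0000</code> offset is matched as literal text rather than with <code>%z</code>, the sketch should also run on Python 2.7, at the cost of producing naive datetimes.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. "Wed Aug 27 13:08:45 +0000 2008".
# The "+0000" offset is matched as literal text, so the result is a
# naive datetime that is implicitly in UTC.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S +0000 %Y"


def parse_created_at(created_at):
    """Turn a created_at string into a naive (implicitly UTC) datetime."""
    return datetime.strptime(created_at, CREATED_AT_FORMAT)


def count_recent(created_at_strings, now=None, days=30):
    """Count timestamps that fall within the last `days` days before `now`."""
    if now is None:
        # Naive UTC "now", to match the naive parsed timestamps.
        now = datetime.utcnow()
    cutoff = now - timedelta(days=days)
    return sum(1 for s in created_at_strings if parse_created_at(s) >= cutoff)
```

Here "a month" is approximated as 30 days, and passing <code>now</code> explicitly keeps the count reproducible when testing. Timezone-aware counting would still need the improvement noted above.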
