Difference between revisions of "Jeemin Sim (Work Log)"

From edegan.com
Jump to navigation Jump to search
Line 9: Line 9:
 
2/15/2017 WEDNESDAY 9AM-11AM
 
2/15/2017 WEDNESDAY 9AM-11AM
 
Discussed with Catherine what to do with FDA Trial data and decided to have a dictionary with zip-codes as keys and number of trials occurred in that zipcode as values. Was still attempting to loop through the files without the code having to exist in the same directory as the XML files. Plan to write to excel via tsv, with zip-code as one column and # of occurrence as the other.
 
Discussed with Catherine what to do with FDA Trial data and decided to have a dictionary with zip-codes as keys and number of trials occurred in that zipcode as values. Was still attempting to loop through the files without the code having to exist in the same directory as the XML files. Plan to write to excel via tsv, with zip-code as one column and # of occurrence as the other.
 +
 +
2/17/2017 FRIDAY 2PM-6PM
 +
Completed code for counting the number of occurrences for each unique zipcode. (currently titled & located: E:\McNair\Projects\FDA Trials\Jeemin_Project\Jeemin_Running_File.py). It has been running for 20+min because of the comprehensive XML data files. Meanwhile started coding to create a dictionary with the keys corresponding to each unique trial ID, mapped to every other information (location, sponsors, phase, drugs ...etc.) (currently titled & located: E:\McNair\Projects\FDA Trials\Jeemin_Project\Jeemin_FDATrial_as_key_data_ripping.py).

Revision as of 18:43, 17 February 2017

2/6/2017 14:00-18:00 Set up wikiPage & remote desktop. Started working on python version of web crawler. So far it successfully prints out a catchphrase/ description for one website. To be worked on. The python file can be found in: E:\McNair\Projects\Accelerators\Python WebCrawler\webcrawlerpython.py

2/8/2017 9AM-11AM Attempted to come up with possible cases for locating the description of accelerators - pick up from extracting bodies of text from the about page (given that it exists)

2/13/2017 MONDAY 2PM- 6PM Goals (for trials): 1) Build ER Diagram 2) For each entity, get XML snippet 3) Build a parser/ripper for single file; the python parser can be found at: E:\McNair\Projects\FDA Trials\Jeemin_Project Trial Data Project

2/15/2017 WEDNESDAY 9AM-11AM Discussed with Catherine what to do with FDA Trial data and decided to have a dictionary with zip-codes as keys and number of trials occurred in that zipcode as values. Was still attempting to loop through the files without the code having to exist in the same directory as the XML files. Plan to write to excel via tsv, with zip-code as one column and # of occurrence as the other.

2/17/2017 FRIDAY 2PM-6PM Completed code for counting the number of occurrences for each unique zipcode. (currently titled & located: E:\McNair\Projects\FDA Trials\Jeemin_Project\Jeemin_Running_File.py). It has been running for 20+min because of the comprehensive XML data files. Meanwhile started coding to create a dictionary with the keys corresponding to each unique trial ID, mapped to every other information (location, sponsors, phase, drugs ...etc.) (currently titled & located: E:\McNair\Projects\FDA Trials\Jeemin_Project\Jeemin_FDATrial_as_key_data_ripping.py).