Trial Data Project
Latest revision as of 13:44, 21 September 2020
Trial Data Project | |
---|---|
Project Information | |
Has title | Trial Data Project |
Has owner | Jeemin Sim, Catherine Kirby |
Has start date | |
Has deadline date | |
Has project status | Complete |
Has sponsor | McNair Center |
Has project output | Data, Tool |
Summary
This project documents how to reprocess the clinical trial data from ClinicalTrials.gov into structured, cleaned datasets. The data covers 239,638 studies from 2000 to present.
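Information Source
- https://clinicaltrials.gov/ct2/resources/download
- https://clinicaltrials.gov/ct2/html/images/info/public.xsd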
Steps Followed to Extract the Trial Data
Extracting Data from XML Files
All the historical ClinicalTrials.gov data is available as XML files, one file per study. Here is the tree structure for the XML files (a leading + marks a collapsed node with children, a leading - an expanded one):
-<clinical_study rank="61205">
  +<required_header>
  +<id_info>
   <brief_title>
  +<sponsors>
   <source>
  +<oversight_info>
  +<brief_summary>
  +<detailed_description>
   <overall_status>Completed</overall_status>
   <phase>Phase 1/Phase 2</phase>
   <study_type>
   <study_design>
   <condition>
  +<intervention>
  +<eligibility>
  +<location>
  +<location_countries>
   <verification_date>
   <lastchanged_date>
   <firstreceived_date>
   <has_expanded_access>
  +<condition_browse>
  +<intervention_browse>
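As a rough illustration of walking this structure, the sketch below parses one study with Python's standard xml.etree.ElementTree module and lists the tags directly under the root. The SAMPLE_XML string is a trimmed stand-in for a downloaded file and top_level_tags is a hypothetical helper name; neither is taken from the project scripts.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for one downloaded study file (assumption: one study per file).
SAMPLE_XML = """<clinical_study rank="61205">
  <id_info><nct_id>NCT00000102</nct_id></id_info>
  <brief_title>Congenital Adrenal Hyperplasia: Calcium Channels as Therapeutic Targets</brief_title>
  <overall_status>Completed</overall_status>
  <phase>Phase 1/Phase 2</phase>
</clinical_study>"""


def top_level_tags(xml_text):
    """Return the tag names of the root's direct children, in document order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


root = ET.fromstring(SAMPLE_XML)
tags = top_level_tags(SAMPLE_XML)
print(root.tag, root.attrib["rank"])
print(tags)
```

For a file on disk, ET.parse(path).getroot() would give the same root element.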
Corresponding tables are:
- Study
- Location
- Sponsors
- Eligibility
- Dates
- MeSH (Medical Subject Headings)
Study
The corresponding nodes are:
-<id_info>
  <org_study_id>NCRR-M01RR01070-0506</org_study_id>
  <secondary_id>M01RR001070</secondary_id>
  <nct_id>NCT00000102</nct_id>
</id_info>
<brief_title>Congenital Adrenal Hyperplasia: Calcium Channels as Therapeutic Targets</brief_title>
-<oversight_info>
  <authority>United States: Federal Government</authority>
</oversight_info>
-<brief_summary>
  <textblock> This study will test the ability of extended release nifedipine (Procardia XL), a blood pressure medication, to permit a decrease in the dose of glucocorticoid medication children take to treat congenital adrenal hyperplasia (CAH). </textblock>
</brief_summary>
-<detailed_description>
  <textblock> This protocol is designed to assess both acute and chronic effects of the calcium channel antagonist, nifedipine, on the hypothalamic-pituitary-adrenal axis in patients with congenital adrenal hyperplasia. The multicenter trial is composed of two phases and will involve a double-blind, placebo-controlled parallel design. The goal of Phase I is to examine the ability of nifedipine vs. placebo to decrease adrenocorticotropic hormone (ACTH) levels, as well as to begin to assess the dose-dependency of nifedipine effects. The goal of Phase II is to evaluate the long-term effects of nifedipine; that is, can attenuation of ACTH release by nifedipine permit a decrease in the dosage of glucocorticoid needed to suppress the HPA axis? Such a decrease would, in turn, reduce the deleterious effects of glucocorticoid treatment in CAH. </textblock>
</detailed_description>
<overall_status>Completed</overall_status>
<phase>Phase 1/Phase 2</phase>
<study_type>Interventional</study_type>
<study_design>Intervention Model: Parallel Assignment, Masking: Double-Blind, Primary Purpose: Treatment</study_design>
<condition>Congenital Adrenal Hyperplasia</condition>
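One way to turn these nodes into a single Study row is to map each column to a path under the root and read it with findtext. This is only a sketch: the STUDY_PATHS mapping and the clean and study_row helpers are illustrative names, and the sample text is shortened from the record above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample record; the summary text is truncated for brevity.
STUDY_XML = """<clinical_study>
  <id_info>
    <org_study_id>NCRR-M01RR01070-0506</org_study_id>
    <nct_id>NCT00000102</nct_id>
  </id_info>
  <brief_title>Congenital Adrenal Hyperplasia: Calcium Channels as Therapeutic Targets</brief_title>
  <oversight_info><authority>United States: Federal Government</authority></oversight_info>
  <brief_summary><textblock>
      This study will test the ability of extended release nifedipine.
  </textblock></brief_summary>
  <overall_status>Completed</overall_status>
  <phase>Phase 1/Phase 2</phase>
  <study_type>Interventional</study_type>
</clinical_study>"""

# Column name -> path relative to <clinical_study> (hypothetical mapping).
STUDY_PATHS = {
    "nct_id": "id_info/nct_id",
    "brief title": "brief_title",
    "oversight authority": "oversight_info/authority",
    "brief summary": "brief_summary/textblock",
    "detailed description": "detailed_description/textblock",
    "overall status": "overall_status",
    "phase": "phase",
    "study type": "study_type",
}


def clean(text):
    """Collapse runs of whitespace; return '' for a missing node."""
    return " ".join(text.split()) if text else ""


def study_row(root):
    """Build one dict per study; absent nodes become empty strings."""
    return {col: clean(root.findtext(path)) for col, path in STUDY_PATHS.items()}


row = study_row(ET.fromstring(STUDY_XML))
print(row["nct_id"], "|", row["phase"], "|", row["brief summary"])
```

Missing nodes (here, the detailed description) come back as empty strings rather than raising, which keeps every row the same width.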
Location
The corresponding node is:
-<location>
  -<facility>
    <name>Medical University of South Carolina</name>
    -<address>
      <city>Charleston</city>
      <state>South Carolina</state>
      <country>United States</country>
    </address>
  </facility>
</location>
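A study can list many locations, so the Location table needs one row per location node, keyed by nct_id. The sketch below assumes a zip element may appear under address (the sample record has none, so the column comes back empty); location_rows is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

LOCATION_XML = """<clinical_study>
  <id_info><nct_id>NCT00000102</nct_id></id_info>
  <location>
    <facility>
      <name>Medical University of South Carolina</name>
      <address>
        <city>Charleston</city>
        <state>South Carolina</state>
        <country>United States</country>
      </address>
    </facility>
  </location>
</clinical_study>"""


def location_rows(root):
    """Return one [nct_id, facility name, city, state, zip, country] row per facility."""
    nct_id = root.findtext("id_info/nct_id", default="")
    rows = []
    for facility in root.findall("location/facility"):
        rows.append([
            nct_id,
            facility.findtext("name", default=""),
            facility.findtext("address/city", default=""),
            facility.findtext("address/state", default=""),
            facility.findtext("address/zip", default=""),  # assumed tag name
            facility.findtext("address/country", default=""),
        ])
    return rows


locations = location_rows(ET.fromstring(LOCATION_XML))
print(locations)
```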
Sponsors
The corresponding node is:
-<sponsors>
  -<lead_sponsor>
    <agency>National Center for Research Resources (NCRR)</agency>
    <agency_class>NIH</agency_class>
  </lead_sponsor>
</sponsors>
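The Sponsors table records whether each agency is a lead sponsor or a collaborator. A possible sketch follows; it assumes collaborators appear as collaborator nodes next to lead_sponsor, which the sample record does not show, so the collaborator entry in the sample is a made-up placeholder.

```python
import xml.etree.ElementTree as ET

# The <collaborator> node is a hypothetical placeholder, not real trial data.
SPONSORS_XML = """<clinical_study>
  <id_info><nct_id>NCT00000102</nct_id></id_info>
  <sponsors>
    <lead_sponsor>
      <agency>National Center for Research Resources (NCRR)</agency>
      <agency_class>NIH</agency_class>
    </lead_sponsor>
    <collaborator>
      <agency>Example Collaborator (hypothetical)</agency>
      <agency_class>Other</agency_class>
    </collaborator>
  </sponsors>
</clinical_study>"""


def sponsor_rows(root):
    """Return [nct_id, sponsor agency, sponsor class, lead or collaborator] rows."""
    nct_id = root.findtext("id_info/nct_id", default="")
    rows = []
    for role, tag in (("lead", "lead_sponsor"), ("collaborator", "collaborator")):
        for node in root.findall(f"sponsors/{tag}"):
            rows.append([
                nct_id,
                node.findtext("agency", default=""),
                node.findtext("agency_class", default=""),
                role,
            ])
    return rows


sponsors = sponsor_rows(ET.fromstring(SPONSORS_XML))
for sponsor in sponsors:
    print(sponsor)
```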
Eligibility
The corresponding node is:
-<eligibility>
  -<criteria>
    <textblock> Inclusion Criteria: - diagnosed with Congenital Adrenal Hyperplasia (CAH) - normal ECG during baseline evaluation Exclusion Criteria: - history of liver disease, or elevated liver function tests - history of cardiovascular disease </textblock>
  </criteria>
  <gender>Both</gender>
  <minimum_age>14 Years</minimum_age>
  <maximum_age>35 Years</maximum_age>
  <healthy_volunteers>No</healthy_volunteers>
</eligibility>
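Ages arrive as text such as "14 Years", so it can help to split the number from the unit when filling the eligibility columns. The sketch below is one way to do that; split_age and eligibility_row are hypothetical helpers, and storing the age as a (number, unit) pair is an assumption, not the format used by the project scripts.

```python
import re
import xml.etree.ElementTree as ET

# Criteria text is shortened from the sample record.
ELIGIBILITY_XML = """<clinical_study>
  <id_info><nct_id>NCT00000102</nct_id></id_info>
  <eligibility>
    <criteria><textblock>
        Inclusion Criteria: - diagnosed with Congenital Adrenal Hyperplasia (CAH)
        Exclusion Criteria: - history of cardiovascular disease
    </textblock></criteria>
    <gender>Both</gender>
    <minimum_age>14 Years</minimum_age>
    <maximum_age>35 Years</maximum_age>
    <healthy_volunteers>No</healthy_volunteers>
  </eligibility>
</clinical_study>"""


def split_age(text):
    """Split '14 Years' into (14, 'Years'); return (None, '') when no number leads."""
    match = re.match(r"\s*(\d+)\s*(\w+)", text or "")
    if not match:
        return (None, "")
    return (int(match.group(1)), match.group(2))


def eligibility_row(root):
    """Return the eligibility columns for one study, or None if the node is absent."""
    node = root.find("eligibility")
    if node is None:
        return None
    criteria = node.findtext("criteria/textblock", default="")
    return {
        "nct_id": root.findtext("id_info/nct_id", default=""),
        "eligibility description": " ".join(criteria.split()),
        "eligibility gender": node.findtext("gender", default=""),
        "eligibility min age": split_age(node.findtext("minimum_age")),
        "eligibility max age": split_age(node.findtext("maximum_age")),
    }


eligibility = eligibility_row(ET.fromstring(ELIGIBILITY_XML))
print(eligibility)
```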
Dates
The corresponding nodes are:
<verification_date>January 2004</verification_date>
<lastchanged_date>June 23, 2005</lastchanged_date>
<firstreceived_date>November 3, 1999</firstreceived_date>
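The date nodes come in at least two shapes: month and year ("January 2004") and a full date ("June 23, 2005"). A minimal sketch for normalizing both with datetime.strptime is below. It assumes English month names and treats a month-only value as the first day of that month, which is a convention chosen here, not one stated by the source.

```python
from datetime import datetime

# Formats seen in the sample record; other shapes may exist in the full data.
DATE_FORMATS = ("%B %d, %Y", "%B %Y")


def parse_trial_date(text):
    """Return a datetime.date for a known format, or None if nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime((text or "").strip(), fmt).date()
        except ValueError:
            continue
    return None


for raw in ("January 2004", "June 23, 2005", "November 3, 1999"):
    print(raw, "->", parse_trial_date(raw))
```

Returning None for unrecognized text makes it easy to count how many records need a format added to DATE_FORMATS.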
MeSH
The corresponding nodes are:
-<condition_browse>
  <mesh_term>Hyperplasia</mesh_term>
  <mesh_term>Adrenal Hyperplasia, Congenital</mesh_term>
  <mesh_term>Adrenogenital Syndrome</mesh_term>
  <mesh_term>Adrenocortical Hyperfunction</mesh_term>
</condition_browse>
-<intervention_browse>
  <mesh_term>Nifedipine</mesh_term>
</intervention_browse>
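Both browse nodes hold repeated mesh_term children, so the MeSH table ends up with one row per term. The sketch below pools condition and intervention terms into the same two-column shape; mesh_rows is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

MESH_XML = """<clinical_study>
  <id_info><nct_id>NCT00000102</nct_id></id_info>
  <condition_browse>
    <mesh_term>Hyperplasia</mesh_term>
    <mesh_term>Adrenal Hyperplasia, Congenital</mesh_term>
    <mesh_term>Adrenogenital Syndrome</mesh_term>
    <mesh_term>Adrenocortical Hyperfunction</mesh_term>
  </condition_browse>
  <intervention_browse>
    <mesh_term>Nifedipine</mesh_term>
  </intervention_browse>
</clinical_study>"""


def mesh_rows(root):
    """Return [nct_id, MeSH term] rows from both browse nodes, in document order."""
    nct_id = root.findtext("id_info/nct_id", default="")
    rows = []
    for browse in ("condition_browse", "intervention_browse"):
        for term in root.findall(f"{browse}/mesh_term"):
            rows.append([nct_id, (term.text or "").strip()])
    return rows


mesh = mesh_rows(ET.fromstring(MESH_XML))
for mesh_row in mesh:
    print(mesh_row)
```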
File locations
All of the files and code that I worked on are in this folder: E:\McNair\Projects\FDA Trials\Jeemin_Project
Trials per zipcode:
  code: Jeemin_Trials_per_zipcode.py
  output: Jeemin_Trials_per_zipcode_output.txt
General data ripping:
  code: Jeemin_FDATrial_as_key_data_ripping.py
  output: Jeemin_FDATrial_as_key_data.ripping_output.txt
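The contents of Jeemin_Trials_per_zipcode.py are not reproduced on this page, so the following is only a guess at the kind of count it produces: the number of distinct trials with at least one location in each zip code. The trials_per_zip function and the (nct_id, zip) pairs are hypothetical.

```python
from collections import defaultdict


def trials_per_zip(pairs):
    """Count distinct trial ids per zip code from (nct_id, zip_code) pairs."""
    trials = defaultdict(set)
    for nct_id, zip_code in pairs:
        if zip_code:  # skip locations with no zip recorded
            trials[zip_code].add(nct_id)
    return {zip_code: len(ids) for zip_code, ids in trials.items()}


# Made-up pairs for illustration only.
pairs = [
    ("TRIAL-A", "00001"),
    ("TRIAL-A", "00001"),  # same trial, second site in the same zip
    ("TRIAL-B", "00001"),
    ("TRIAL-B", "00002"),
    ("TRIAL-C", ""),
]
counts = trials_per_zip(pairs)
print(counts)
```

Using a set per zip code keeps a trial with several sites in one zip from being counted more than once.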
Tables
Table 1:
row_headers1 = ['nct_id', 'brief title', 'oversight authority', 'brief summary',
    'detailed description', 'overall status', 'start date', 'completion date', 'phase',
    'study type', 'study design', 'condition', 'intervention type', 'intervention name',
    'eligibility description', 'eligibility gender', 'eligibility min age',
    'eligibility max age', 'verification date', 'lastchanged date', 'firstreceived date',
    'has expanded access']
Table 2:
row_headers2 = ['nct_id', 'sponsor agency', 'sponsor class', 'lead or collaborator']
Table 3:
row_headers3 = ['nct_id', 'facility name', 'city', 'state', 'zip', 'country']
Table 4:
row_headers4 = ['nct_id', 'MeSH term']
Table 5:
row_headers5 = ['nct_id', 'keyword']
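Each header list can serve as the first line of a delimited output file, with nct_id as the key that joins the tables. The sketch below writes Table 2 with Python's csv module. Tab-delimited output and the write_table helper are assumptions, and an in-memory buffer stands in for the real output file.

```python
import csv
import io

row_headers2 = ['nct_id', 'sponsor agency', 'sponsor class', 'lead or collaborator']


def write_table(handle, headers, rows):
    """Write a header line followed by rows; reject rows of the wrong width."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(headers)
    for row in rows:
        if len(row) != len(headers):
            raise ValueError(f"expected {len(headers)} columns, got {len(row)}")
        writer.writerow(row)


buffer = io.StringIO()  # stand-in for open(path, "w", newline="")
write_table(buffer, row_headers2, [
    ["NCT00000102", "National Center for Research Resources (NCRR)", "NIH", "lead"],
])
lines = buffer.getvalue().splitlines()
print(lines)
```

The width check catches rows built from a record that is missing a column before they reach the file.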