ACARD (Automated Claims and Medical Record Databases) is a core in the Center for Clinical Epidemiology and Biostatistics (CCEB) created to foster epidemiologic research into therapeutics using large population-based automated claims databases (Medicaid / Medicare), medical records databases (GPRD™ and THIN), and an adverse event reporting system (AERS) database.
• Encourage investigators with diverse training to conduct studies focused on translational therapeutics and drug safety
• Provide access to large databases that allow for rapid and cost-effective research on therapeutics and drug safety
• Foster collaboration with CTSA sites outside of Penn who wish to use these databases for clinical research
"Also available is the FDA's Adverse Event Reporting System (AERS) database, which is a publically available resource that supports the FDA's post-marketing surveillance program. This database contains information on consumer, provider and manufacturer reported adverse events, potentially related pharmaceuticals, and outcomes when known."
"Lexicon Plus is a drug database useful for identifying national drug codes (NDCs) and drug prices for individual drugs and drug classes."
"The Medicaid data files are available to researchers directly from CMS (or its research data distribution center, Buccaneer Computer Systems & Service, Inc.). Files are typically requested on a per state, per year basis and involve the completion and submission of a new data use request packet to CMS. The specific Medicaid files that are available through ACARD and are stored on a dedicated server, are described further below."
"Sean Hennessy, PharmD, PhD, Associate Professor in the CCEB currently holds two NIH-funded R01s, and a RC1 utilizing Medicaid and Medicare data. These data are available to other researchers of the CTSA in accordance with rules set forth by CMS and policies established by the CCEB.
The following Medicaid and Medicare data reside in-house: 1999-2006 Medicaid claims (and supplemental Medicare claims for dual-eligibles) for beneficiaries of California, Florida, New York, Ohio, and Pennsylvania. These states have five of the largest Medicaid programs in the US and at any one time compromise approximately 23 million beneficiaries, or 39% of the entire Medicaid program."
"OptumInsight is clinically rich US health care claims database used to conduct research studies. OptumInsight accesses a comprehensive, large and robust proprietary healthcare database which has been capturing detailed healthcare data since 1993. The Optum’s database licensed by Penn, sourced by its parent company’s commercial ‘administrative services only’ and ‘fully insured’ populations, includes enrollment information from 1993 to 2012 on more than 90 million lives at any point and more than 13 million current lives, including clinical laboratory results to pharmacy and medical claims history. The completeness of this data allows the investigation of any number of patient, physician, treatment, and clinical attributes. Geographic diversity, combined with the longitudinal nature of these data and the ability to see treatments in many settings, affords a unique perspective to the researcher. Subjects can be studied longitudinally, as individuals who disenroll and re-enroll can be identified. This is considered to be a significant advantage of Optum data over most commercially-available databases that erroneously count multiple enrollments as distinct members."
"The Health Improvement Network (THIN) database includes the general practice electronic medical records for a sample of patients in the UK. Each patient in UK is registered with a general practitioner (GP) near their home and the GP determines if the patients can be treated in primary, or be referred to secondary care. Thus each patient has a longitudinal electronic medical record maintained by the GP. Following a referral by the GP to secondary care, the GP receives a summary of any diagnoses and required medical care. Because of this continuity of care, this medical system is especially useful for conducting long-term longitudinal observational studies.
As of November 2009, the THIN database contains the health records, from 464 practices around UK who use Vision software. The Vision software is an electronic medical record computer system used to record the details of each consultation, which may include problem heading, prescribed treatment, referral, tests and results. All medical conditions are recorded using Read Codes and lifestyle data (such as smoking) are also available. Some lifestyle data, such as diet and exercise, are limited. There are over 7.5 million patient records with acceptable recorded data, of which over 3 million are active patients. The population in THIN is representative of the national population in terms of age, gender, leading diagnoses and treatments. This consultation and prescribing characteristics for the THIN practices are similar to the national GP characteristics.
In addition to the main consultation being recorded, most patients in THIN Data are linked to postcode-level area-based socioeconomic, ethnicity and environmental indices. The data are based on the patient's postcode so that variables at ward level are available. This information has been collected for THIN without compromising patient confidentiality in any way."
"The UK General Practice Research Database (GPRD™) was started up in 1987 as the VAMP Research Data Bank. GPRD™ contains longitudinal data from the electronic medical records of patients from a large sample of general practices within the UK."
"In any given year, GPs who are members of the GPRD™ collect data on about 2- 3 million patients, yielding about 37 million person-years of follow-up between 1987 and March 2002 (the last year of data collection in the data set at Penn). Continuous information has been collected for 6 years or more in most of the practices. About 5% of the UK population is included in the GPRD™, which is broadly representative of the general UK population in terms of age, sex, and geographic distributions."
"EPIC licensed to Penn a static version of the GPRD™ (EPICGPRD™) which contains data from a total of 755 practices and has over 9 million patients with the most recent data collected through March 2002."
"In addition to access to the entire GPRD™ database, there is also a file containing data from a 10% random sample of the GPRD™ data. This dataset was created for convenience and more efficient use of these data.
Because of the large size of the GPRD™ database, investigators are encouraged to carry out all development work first using the 10% random sample file. Tasks such as writing the code to read and process the data, developing all study variables (including recoded variables), writing the code for cohort selection and exclusion criteria, and writing the code for frequency counts and other analyses should be tested on the 10% random sample file. Once the code has been shown to perform all of these tasks correctly on the smaller data file, it can be applied to the entire GPRD™ database to obtain the final study results. Of note, some studies may be possible to complete using only the 10% random sample."
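The develop-on-the-sample, then-run-on-the-full-file workflow described above can be sketched in Python. This is only an illustration under stated assumptions: the column names (`patid`, `age`, `sex`), the inclusion criterion, and the inline stand-in data are hypothetical and do not reflect the actual GPRD™ file layout.

```python
import csv
import io
from collections import Counter

def select_cohort(rows, min_age=18):
    """Keep patients meeting a hypothetical inclusion criterion (age >= min_age)."""
    return [r for r in rows if int(r["age"]) >= min_age]

def count_by(rows, field):
    """Frequency counts of one variable among the selected patients."""
    return Counter(r[field] for r in rows)

def run_pipeline(handle):
    """Read a delimited extract, apply cohort selection, return counts by sex."""
    rows = list(csv.DictReader(handle))
    cohort = select_cohort(rows)
    return count_by(cohort, "sex")

# Stand-in for the 10% random sample file (hypothetical columns and values).
# A real run would pass an open file handle for the sample extract instead.
sample = io.StringIO("patid,age,sex\n1,34,F\n2,12,M\n3,58,M\n4,71,F\n5,45,F\n")
sample_counts = run_pipeline(sample)
print(sample_counts)
```

Once the counts from the sample look sensible, the same `run_pipeline` function would be pointed at the full extract without changing the code, so the only difference between the test run and the final run is the input file.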
"In addition to access to the entire THIN database, there is also a file containing data from a 10% random sample of the THIN data. This dataset was created for convenience and more efficient use of these data.
Because of the large size of the THIN database, investigators are encouraged to carry out all development work first using the 10% random sample file. Tasks such as writing the code to read and process the data, developing all study variables (including recoded variables), writing the code for cohort selection and exclusion criteria, and writing the code for frequency counts and other analyses should be tested on the 10% random sample file. Once the code has been shown to perform all of these tasks correctly on the smaller data file, it can be applied to the entire THIN database to obtain the final study results.
Additionally, there is a 1% sample of the THIN database available for students registered for course # EPID 633 – Advanced Database Management for Clinical Research. The course uses THIN as an example of a secondary database to load and test queries to generate analytic files for analysis. This course is recommended for students wishing to learn more about the structure of the THIN database."
"Investigators interested in conducting clinical, epidemiological, and pharmacoepidemiological research can obtain easy access to several large population-based automated databases that are available in-house at the Center for Clinical Epidemiology and Biostatistics (CCEB). These databases include Medicaid and linked Medicare claims data from five of the largest states in the US as well as medical records data from the practices of general practitioners in the UK. The cost and time needed to complete research based on analyses of secondary data such as these are much lower than what is required to collect data de novo. These databases include extensive information on diagnoses, treatments, procedures, and labs associated with patients' encounters with the medical establishment in well-defined population areas.
Also available is the FDA's Adverse Event Reporting System (AERS) database, which is a publicly available resource that supports the FDA's post-marketing surveillance program. This database contains information on consumer-, provider-, and manufacturer-reported adverse events, potentially related pharmaceuticals, and outcomes when known.
In addition to the full databases, files containing 10% random samples of these data are available. These were created for convenience and more efficient use of the data, and they are especially useful for preliminary testing of all study variables, methods, and analyses.
Data dictionaries and manuals describing each of the databases are available, as well as a large library of disease and drug codes that can be useful to help define exposure, outcome, and confounding variables.
Detailed descriptions of each of the databases and examples of published research based on these databases are provided as well."
• Provide access to the needed hardware and software in an environment supported by expert faculty supervision.
• Provide access to a library of statistical and programming resources for executing studies within the available electronic databases.
• Provide consultation to investigators during the planning stages and the conduct of their research;
• Assist investigators with preliminary feasibility queries (e.g., obtaining frequency counts of patients with a given disease or medication exposure);
• Assist investigators in applying for data use agreements where necessary;
• Assist applicants in writing research proposals to make sure the study is informed by an in-depth understanding of the strengths and weaknesses of each data resource
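As an illustration of the preliminary feasibility queries mentioned in the list above, the following Python sketch counts distinct patients with a given diagnosis, a given medication exposure, or both. It uses an in-memory SQLite database; the table names, column names, codes, and rows are all hypothetical and are not taken from any of the ACARD databases.

```python
import sqlite3

# Hypothetical miniature tables standing in for diagnosis and prescription files.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE diagnosis (patid INTEGER, code TEXT);
CREATE TABLE prescription (patid INTEGER, drug TEXT);
INSERT INTO diagnosis VALUES (1, 'DX_A'), (1, 'DX_A'), (2, 'DX_A'), (3, 'DX_B');
INSERT INTO prescription VALUES (1, 'drug_x'), (2, 'drug_y'), (3, 'drug_x'), (3, 'drug_x');
""")

def patients_with_diagnosis(conn, code):
    """Number of distinct patients with at least one record of the diagnosis code."""
    sql = "SELECT COUNT(DISTINCT patid) FROM diagnosis WHERE code = ?"
    return conn.execute(sql, (code,)).fetchone()[0]

def patients_exposed(conn, drug):
    """Number of distinct patients with at least one prescription for the drug."""
    sql = "SELECT COUNT(DISTINCT patid) FROM prescription WHERE drug = ?"
    return conn.execute(sql, (drug,)).fetchone()[0]

def patients_with_both(conn, code, drug):
    """Number of distinct patients with both the diagnosis and the exposure."""
    sql = """
        SELECT COUNT(DISTINCT d.patid)
        FROM diagnosis AS d
        JOIN prescription AS p ON p.patid = d.patid
        WHERE d.code = ? AND p.drug = ?
    """
    return conn.execute(sql, (code, drug)).fetchone()[0]

n_dx = patients_with_diagnosis(conn, "DX_A")
n_rx = patients_exposed(conn, "drug_x")
n_both = patients_with_both(conn, "DX_A", "drug_x")
print(n_dx, n_rx, n_both)
```

Counting distinct patient identifiers rather than rows matters here, because a patient typically has many claims or consultation records for the same diagnosis or drug.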