Browsing by Subject "Data Mining"
Now showing 1 - 3 of 3
Item Defining Social Network Structure Through Text Similarity Analysis: A Model for Promoting Collaboration and Examining Conditions Impacting the Success of Collaborative Endeavors Within a Research Community
(2007-05-22) Moser, Courtney Joy; Krumwiede, Kimberly Hoggatt
Given the breadth and sheer volume of accumulated scientific knowledge, individual researchers often lack the requisite knowledge and resources to adequately address increasingly complex problems; therefore, many researchers are realizing the advantages afforded by collaborative research practices. The application of text data mining technologies to social networking strategies provides a novel approach to identifying opportunities for scientific collaboration through text similarity analysis, provided by the computer program eTSNAP. The data set submitted to eTSNAP comprised 137 research abstracts representing individual scientists affiliated with the Regional Centers of Excellence in Biodefense and Emerging Infectious Diseases. Examination of the data in the form of tables, matrices, and interactive similarity network maps revealed the presence of eight discrete clusters of individuals, linked by the similarity of their abstracts. Further analysis of structural and functional characteristics of each cluster permitted the selection of a single cluster with the highest probability of collaborative success to serve as the pilot cluster. Members of this pilot cluster, renamed the "anthrax cluster" in reference to the common theme of research, received an introductory packet of information explaining the design of the project and soliciting participation in a preliminary survey, developed with intentions of assessing collaborative readiness and garnering practical information to assist in the preparation of a future teleconference.
When multiple requests failed to elicit an adequate response, further attempts at establishing collaborative relationships between these researchers merely represented an exercise in futility. Evaluation of this project ultimately consisted of a secondary telephone interview with cluster members along with an in-depth literature review; both components of the final evaluation endeavored to isolate and examine factors that facilitate or inhibit collaboration within a research environment. Results suggest that similar interests alone cannot sustain successful collaboration; rather, complex interactions between a multitude of interconnected variables essentially determine collaborative outcomes.

Item The IRIDESCENT System: An Automated Data-Mining Method to Identify, Evaluate, and Analyze Sets of Relationships Within Textual Databases
(2003-02-01) Wren, Jonathan Daniel; Garner, Harold R.
Individuals are limited in their ability to read, remember and compare relationships within the vast amount of scientific literature available. This is not only because the amount of literature is increasing exponentially, but the number of things being researched within is as well. Adding to the scale of analysis are new technologies that increase the rate by which data is being gathered from scientific experiments. For most areas of research interest, the scale of analysis exceeds an individual's ability to be aware of all the relationships contained within. Thus, an informatics approach is necessary to identify large-scale trends, shared relationships and novel relationships that are not contained within the literature, but are the logical consequence of the relationships that are. A system has been designed to establish a network of relationships between "objects" of research interest (e.g. genes, chemical compounds, drugs, diseases and clinical phenotypes) by extracting information from scientific text in an automated manner.
This system, called IRIDESCENT (Implicit Relationship IDEntification by in-Silico Construction of an Entity-based Network from Text), enables the discovery of novel relationships by identifying and scoring objects sharing large sets of relationships with an object of interest. IRIDESCENT also allows sets of objects to be analyzed for shared relationships, such as responding genes from a microarray experiment. Herein is described the development and workings of IRIDESCENT as well as several well-developed applications of the system.

Item Solid Organ Transplantation & Data Mining: Bloodstream Infections Have a Significant Impact on One-Year Survival and qSOFA ≥ 2 Predicts 3-Day Mortality
(2018-01-23) Liu, Terrence; Xie, Donglu; Adams-Huet, Beverley; Le, Jade; Yek, Christina; Ranganathan, Dipti; Haley, Robert W.; Greenberg, David; La Hoz, Ricardo
BACKGROUND: We created a retrospective and prospective database of SOT recipients using innovative data mining tools. This study describing the epidemiology of BSI in SOT serves as a proof of concept of such techniques in clinical research. METHODS: The design of the study was a retrospective single center cohort study. Data mining tools were used to extract information from the electronic medical record and merge it with data from the Scientific Registry of Transplant Recipients (SRTR) national database. First SOT from 1/1/2010-12/31/2015 were included. Charts of subjects with positive blood cultures were manually reviewed and adjudicated using CDC/NHSN and SCCM/ESICM criteria. The 1-year cumulative incidence was calculated using the Kaplan-Meier method. Cox proportional hazards models were used to identify risk factors for BSI and 1-year mortality. BSI was analyzed as a time-dependent covariate in the mortality model. Fisher's exact test and Chi-Square were used to identify risk factors for 30-day mortality and multidrug resistant organisms (MDRO). RESULTS: 917 SOT recipients met inclusion criteria.
75 patients experienced at least one BSI. The cumulative incidence was 8.4% (95% CI 6.8-10.4). The onset of the 1st BSI episode was: 30 episodes (40%) < 1 month, 33 (44%) 1-6 months and 12 (16%) > 6 months. The most common pathogens were Klebsiella sp. (16%), Vancomycin-resistant E. faecium (12%), E. coli (12%), CoNS (12%), and Candida sp. (9.3%). Nineteen isolates (25%) were identified as MDRO; the risk of MDRO was highest < 1 month compared to 1-6 and > 6 months (44.8 vs. 12.1 vs. 16.7, p=0.01). The most common source of BSI was CLABSI (29%). In multivariable analysis, the risk of BSI was associated with organ type (HR [95% CI] = multiorgan 3.5 [1.1-11.6], liver 2.5 [1.1-5.4], heart 2.4 [1.1-5.1]) and acquisition of a BSI was associated with a higher 1-year mortality (HR=8.7 [5.1-14.7]). In univariable analysis, a polymicrobial BSI (14.7 vs. 57.1%, p=0.02), qSOFA ≥ 2 (0.0 vs. 25.5%, p=0.02) and septic shock (3.9 vs. 52.2%, p<0.001) were associated with an increased risk of death at 30 days. CONCLUSION: A BSI significantly impacts the 1-year survival of SOT recipients. A qSOFA ≥ 2 can be used at the bedside to identify patients at increased risk for death. Additionally, this study illustrates the potential of data mining tools to study infectious complications.