imed: Self Diagnosis System For Disease Treatment


Published on Sep 16, 2019

Abstract

The Machine Learning (ML) field has gained momentum in almost every domain of research and has only recently become a reliable tool in the medical domain. The empirical domain of automatic learning is used in tasks such as medical decision support, medical imaging, protein-protein interaction, extraction of medical knowledge, and overall patient management care.

ML is envisioned as a tool by which computer-based systems can be integrated into the healthcare field in order to provide better, well-organized medical care. This paper describes an ML-based methodology for building an application that is capable of identifying and disseminating healthcare information. It extracts sentences from published medical papers that mention diseases and treatments, and identifies the semantic relations that exist between diseases and treatments. Our evaluation results for these tasks show that the proposed methodology obtains reliable outcomes that could be integrated in an application to be used in the medical care domain. The potential value of this paper lies in the ML settings that we propose and in the fact that we outperform previous results on the same data set.

EXISTING SYSTEM:

The traditional healthcare system is also becoming one that embraces the Internet and the electronic world. Electronic Health Records (EHR) are becoming the standard in the healthcare domain. Research and studies show that the potential benefits of having an EHR system are: health information recording and clinical data repositories (immediate access to patient diagnoses, allergies, and lab test results that enable better and time-efficient medical decisions); medication management (rapid access to information regarding potential adverse drug reactions, immunizations, supplies, etc.); decision support (the ability to capture and use quality medical data for decisions in the workflow of healthcare); and obtaining treatments that are tailored to specific health needs (rapid access to information that is focused on certain topics).

DISADVANTAGES OF EXISTING SYSTEM:

In order to embrace the views that the EHR system has, we need better, faster, and more reliable access to information.

Research discoveries enter the repository at a high rate, making the process of identifying and disseminating reliable information a very difficult task.

PROPOSED SYSTEM:

In the proposed system, this work aims to show which Natural Language Processing (NLP) and Machine Learning (ML) techniques, which representations of information, and which classification algorithms are suitable for identifying and classifying relevant medical information in short texts. We recognize that tools capable of identifying reliable information in the medical domain stand as building blocks for a healthcare system that is up-to-date with the latest discoveries. In this research, we focus on disease and treatment information, and on the relation that exists between these two entities. The approach used to solve the two proposed tasks is based on NLP and ML techniques. In a standard supervised ML setting, a training set and a test set are required. The training set is used to train the ML algorithm and the test set to test its performance.

ADVANTAGES OF PROPOSED SYSTEM:

The advantage is that if a feature appears more than once in a sentence, it is considered important, and the frequency-value representation captures this: the feature's value will be greater than that of other features.
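The frequency-value idea can be illustrated with a minimal Java sketch. The class name `FeatureFrequency`, the whitespace tokenization, and the example sentence are assumptions made for illustration only; they are not taken from the project's code.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: counts how often each token occurs in one sentence.
final class FeatureFrequency {

    // Returns token -> number of occurrences, in order of first appearance.
    static Map<String, Integer> count(String sentence) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Assumption: tokens are separated by whitespace.
        for (String token : sentence.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty input yields a single empty token
            }
            Integer old = counts.get(token);
            counts.put(token, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "aspirin" and "reduces" occur twice, so their values exceed the others.
        System.out.println(count("aspirin reduces fever and aspirin reduces pain"));
    }
}
```

A feature that is repeated in the sentence receives a larger value than one that appears once, which is the property the frequency-value representation relies on.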

MODULES:

1. Input progression

2. Medical Web domain

3. Tasks and Data Sets

I. Sentence identification

II. Relation identification

4. CLASSIFICATION ALGORITHMS AND DATA REPRESENTATIONS

I. Bag-of-Words Representation

II. NLP and Biomedical Concepts Representation

III. Medical Concepts (UMLS) Representation

5. Output Performance

MODULES DESCRIPTION:

1. Input progression

This module takes as input some words of a medical sentence. The input supplies the matching words used to search the web domains. Rules are used to determine whether a textual input contains relations, based on the sentences in which the relation appears and the local context of the entities.

2. Medical Web domain

The web domain stores the medical web databases. The domain responds to a request by searching its data and returning the output to the user level. ML has gained momentum in almost every domain of research and has only recently become a reliable tool in the medical domain. The empirical domain of automatic learning is used in tasks such as medical decision support, medical imaging, protein-protein interaction, extraction of medical knowledge, and overall patient management care.

3. Tasks and Data Sets

The two tasks that are undertaken in this paper provide the basis for the design of an information technology framework that is capable of identifying and disseminating healthcare information. The first task identifies and extracts informative sentences on disease and treatment topics, while the second performs a finer-grained classification of these sentences according to the semantic relations that exist between diseases and treatments.

i. Sentence identification: This task identifies the sentences from Medline published abstracts that talk about diseases and treatments. The task is similar to a scan of the sentences contained in the abstract of an article in order to present to the user only the sentences that are identified as containing relevant information.

ii. Relation identification: This task has a deeper semantic dimension and is focused on identifying disease-treatment relations in the sentences already selected as being informative. It focuses mainly on three relations: Cure, Prevent, and Side Effect, a subset of the eight relations that the corpus is annotated with. We decided to focus on these three relations because they are the most represented in the corpus, while very few examples are available for the other five.

The approach used to solve the two proposed tasks is based on NLP and ML techniques. In a standard supervised ML setting, a training set and a test set are required. The training set is used to train the ML algorithm and the test set to test its performance.
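As a minimal sketch of the supervised setting described above, the following Java class splits a list of sentences into a training part and a test part. The class name `TrainTestSplit`, the shuffling step, the fixed seed, and the 80/20 proportion used in the example are all assumptions for illustration; the source does not state how its data was divided.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: divides labelled sentences into training and test sets.
final class TrainTestSplit {

    // Returns a two-element list: index 0 is the training set, index 1 the test set.
    static List<List<String>> split(List<String> sentences, double trainFraction, long seed) {
        if (trainFraction < 0.0 || trainFraction > 1.0) {
            throw new IllegalArgumentException("trainFraction must be between 0 and 1");
        }
        // Shuffle a copy so the caller's list is left untouched;
        // the seed makes the split reproducible.
        List<String> shuffled = new ArrayList<>(sentences);
        Collections.shuffle(shuffled, new Random(seed));

        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<List<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("s1", "s2", "s3", "s4", "s5");
        List<List<String>> parts = split(data, 0.8, 42L);
        System.out.println("train size: " + parts.get(0).size());
        System.out.println("test size: " + parts.get(1).size());
    }
}
```

The classifier would be fitted on the first part only, and its performance measured on the second part, which it has not seen during training.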

4. CLASSIFICATION ALGORITHMS AND DATA REPRESENTATIONS

This module should be reliable at identifying informative sentences and discriminating disease-treatment semantic relations. The output needs to be guided such that high performance is obtained. The experimental settings are chosen so that they are adapted to the domain of study and to the type of data, allowing the methods to bring improved performance.

i. Bag-of-Words Representation:

The bag-of-words (BOW) representation is commonly used for text classification tasks. It is a representation in which features are chosen among the words that are present in the training data. Selection techniques are used in order to identify the most suitable words as features. After the feature space is identified, each training and test instance is mapped to this feature representation by giving values to each feature for a certain instance.
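The mapping described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the class name `BagOfWords` and the simple tokenizer are hypothetical, every training word is kept as a feature (no selection technique is applied), and feature values are occurrence counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical bag-of-words mapper: the feature space comes from training data only.
final class BagOfWords {

    // word -> position of that word in the feature vector
    private final Map<String, Integer> index = new LinkedHashMap<>();

    BagOfWords(List<String> trainingSentences) {
        for (String sentence : trainingSentences) {
            for (String token : tokenize(sentence)) {
                if (!index.containsKey(token)) {
                    index.put(token, index.size());
                }
            }
        }
    }

    // Assumption: a token is a lower-cased run of ASCII letters or digits.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Maps a training or test sentence onto the feature space.
    // Words that were never seen in training are ignored.
    int[] vectorize(String sentence) {
        int[] vector = new int[index.size()];
        for (String token : tokenize(sentence)) {
            Integer position = index.get(token);
            if (position != null) {
                vector[position]++;
            }
        }
        return vector;
    }

    int size() {
        return index.size();
    }

    public static void main(String[] args) {
        BagOfWords bow = new BagOfWords(Arrays.asList("aspirin treats fever", "fever returns"));
        System.out.println(Arrays.toString(bow.vectorize("Fever, fever and aspirin")));
    }
}
```

Because the feature space is fixed after training, a test sentence is described only in terms of words the classifier already knows, which keeps training and test vectors the same length.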

ii. NLP and Biomedical Concepts Representation:

The second type of representation is based on syntactic information: noun-phrases, verb-phrases, and biomedical concepts identified in the sentences. The extracted information is preprocessed by removing features that contain only punctuation, removing stop words, and considering as valid features only the lemma-based forms. We chose to use lemmas because there are many inflected forms of the same word, and the lemmatized form gives us the same base form for all of them. Another reason is to reduce the data sparseness problem.
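The three preprocessing steps can be sketched in Java as shown here. The class name `FeatureCleaner`, the short stop-word list, and the small lemma lookup table are all hypothetical stand-ins chosen for illustration; a real system would rely on a full stop-word list and a proper lemmatizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical feature cleaner: drops punctuation and stop words, keeps lemmas.
final class FeatureCleaner {

    // Assumption: a tiny illustrative stop-word list.
    private static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("the", "a", "an", "of", "and", "in", "is", "are", "was", "were"));

    // Assumption: a lookup table standing in for a real lemmatizer.
    private static final Map<String, String> LEMMAS = new HashMap<>();
    static {
        LEMMAS.put("treats", "treat");
        LEMMAS.put("treated", "treat");
        LEMMAS.put("treating", "treat");
        LEMMAS.put("patients", "patient");
    }

    static List<String> clean(List<String> tokens) {
        List<String> features = new ArrayList<>();
        for (String raw : tokens) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (token.isEmpty() || token.matches("\\p{Punct}+")) {
                continue; // step 1: remove features that contain only punctuation
            }
            if (STOP_WORDS.contains(token)) {
                continue; // step 2: remove stop words
            }
            String lemma = LEMMAS.get(token);
            features.add(lemma != null ? lemma : token); // step 3: keep the lemma form
        }
        return features;
    }

    public static void main(String[] args) {
        System.out.println(clean(Arrays.asList("The", "patients", "were", "treated", ",", "treats")));
    }
}
```

In this sketch "treated" and "treats" both become "treat", so the two inflected forms contribute to a single feature instead of two sparse ones.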

iii. Medical Concepts (UMLS) Representation

In order to work with a representation that provides features that are more general than the words in the abstracts, we also used the Unified Medical Language System (UMLS) concept representations. UMLS is a knowledge source developed at the US National Library of Medicine, and it contains a Metathesaurus, a semantic network, and the SPECIALIST Lexicon for the biomedical domain. The Metathesaurus is organized around concepts and meanings; it links alternative names and views of the same concept and identifies useful relationships between different concepts.
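The idea of linking alternative names to one concept can be sketched with a small in-memory table. This Java example does not call UMLS or any of its tools: the class name `ConceptMapper` and the identifier `CONCEPT_001` are hypothetical placeholders, not real UMLS identifiers.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical concept table: several surface names map to one concept identifier.
final class ConceptMapper {

    private final Map<String, String> nameToConcept = new HashMap<>();

    // Registers every alternative name under the same concept identifier.
    void addConcept(String conceptId, String... names) {
        for (String name : names) {
            nameToConcept.put(name.toLowerCase(Locale.ROOT), conceptId);
        }
    }

    // Returns the concept identifier for a name, or null when the name is unknown.
    String lookup(String name) {
        return nameToConcept.get(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        ConceptMapper mapper = new ConceptMapper();
        // Placeholder identifier, chosen for this example only.
        mapper.addConcept("CONCEPT_001", "heart attack", "myocardial infarction");
        System.out.println(mapper.lookup("Heart attack"));
        System.out.println(mapper.lookup("myocardial infarction"));
    }
}
```

Both names resolve to the same identifier, so a classifier that uses concept identifiers as features treats the two phrasings as one feature, which is more general than the individual words.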

5. Output Performance

The output must present the exact web medical data and the relational data retrieved for the input. We extracted only noun-phrases, verb-phrases, and biomedical concepts as potential features from the output for each sentence present in the data set.

SYSTEM FLOW DIAGRAM:

[Figure: imed system flow diagram]

SYSTEM REQUIREMENTS:

HARDWARE REQUIREMENTS:

 System : Pentium IV 2.4 GHz.

 Hard Disk : 40 GB.

 Floppy Drive : 1.44 MB.

 Monitor : 15 VGA Colour.

 Mouse : Logitech.

 RAM : 512 MB.

 Mobile : Android

SOFTWARE REQUIREMENTS:

 Operating system : Windows XP.

 Coding Language : Java 1.7

 Tool Kit : Android 2.3

 IDE : Eclipse







