Python Sample Project Code: Developing a Health Doctor Recommendation System Based on Doctors' Reviews

Biomed Res Int. 2021; 2021: 7431199.

Physician Recommendation Model Based on Ontology Characteristics and Disease Text Mining Perspective

Chunhua Ju

¹Business Administration College, Zhejiang Gongshang University, Hangzhou, China

Shuangzhu Zhang

²School of Management Science and Engineering, Zhejiang Gongshang University, Hangzhou, China

Received 2021 May 21; Accepted 2021 Jul 20.

Data Availability Statement: The data were collected with help from the administrator of the WeiYi platform. Due to third-party rights, patient privacy, and commercial confidentiality, the data are not openly available.

Abstract

Background

Patients at home can access online medical services, such as disease diagnosis, medical treatment guidance, and medication guidance, provided by doctors from all over the country. Because online medical service scenarios are complex and require professional knowledge, traditional recommendation methods in the medical field face problems such as low computational efficiency and poor effectiveness. At the same time, patients consulting online come from all regions, and most of them suffer from nonacute or malignant diseases, so offline medical treatment may follow. Therefore, this paper proposes an online prediagnosis doctor recommendation model that integrates ontology characteristics and disease text. In particular, the model takes full account of patients' geographical location.

Objective

The recommendation model takes real online consultation data as the research object to fully test its effectiveness. Specifically, the model recommends departments and doctors to patients based on patients' symptoms, diagnosis, and geographical location, as well as doctors' specialties and departments.

Methods

Using crawler techniques, five hospital departments were selected from the online medical service platform. The department names followed the standardized department names used in real hospitals (e.g., endocrinology, dermatology, gynecology, pediatrics, and neurology). As a result, a dataset of 20000 patient consultation questions was built. Using Python and MySQL, word vectors, rather than semantic dictionary retrieval or word frequency statistics, were used to measure the similarity between patients' prediagnosis and doctors' specialties. This sentence similarity measurement forms the basis of a recommendation framework for medical departments and doctors, which provides recommendations on the intended departments and doctors.

Results

In the online medical field, compared with traditional recommendation methods, the model proposed in this paper achieves higher accuracy and feasibility in department and doctor recommendation.

Conclusions

The proposed online prediagnosis doctor recommendation model integrates ontology characteristics and disease text mining. The model gives relatively more accurate recommendations based on ontology characteristics such as patients' description texts and doctors' specialties. Furthermore, the model takes full account of patients' location. As a result, the proposed model would improve patients' online consultation experience and the convenience of offline treatment, enriching the value of online prediagnosis data.

1. Introduction

As the emphasis of medical care gradually shifts from disease to patient, the role of patients' participation in online health improvement is becoming more prominent. Health services around the world differ not only across regions but also in terms of online health services [1, 2]. Specifically, there are phenomena such as information asymmetry between doctors and patients and the geographically unequal distribution of medical resources [3]. Therefore, online doctor registration and intelligent department recommendation have become important topics of medical informatization. According to a report released in 2019 by the Big Data Research Institute, the number of users in China's medical and health market was nearly 800 million by the end of 2018 [4]. With a large number of doctors and patients interacting online, a large amount of real consultation data has accumulated in the online health community. Therefore, it is of important theoretical and practical value to investigate how to make full use of online data to build models that improve patients' medical treatment experience by increasing the accuracy of patients' medical choices and the effectiveness of department recommendation.

The existing literature has studied both department recommendation and doctor recommendation. The two main approaches to department recommendation are based on expert systems and on similarity calculation. In expert-system-based department recommendation, a medical knowledge base is established with the help of medical experts, and the experts' diagnostic process is simulated with a rule-based reasoning engine. Patients' diseases are thereby predicted so that the target department can be recommended. Moreover, expert-based department recommendation built on fuzzy logic and RBF neural networks effectively improves recommendation accuracy [5, 6]. On the other hand, the large number of reasoning rules causes many problems, such as low computational efficiency and high maintenance cost of the knowledge base. In similarity-based department recommendation, the current literature uses various methods to measure similarity, such as the similarity between patients' symptoms and diseases' symptoms [7], TF-IDF sentence-based similarity and a TF-IDF algorithm based on multiple words [8, 9], and the combination of backward focus shifting and a professional medical corpus [10]. These similarity-based methods calculate, respectively, the probability of having a disease and the descriptive words that may correspond to certain symptoms, and thereby recommend departments to patients. Research on doctor recommendation is mainly based on content-based and collaborative filtering recommendation algorithms, focusing on user keywords, browsing history, evaluations, and other information [11, 12]. The user collaborative filtering algorithm assumes that a user and a group of other users who share similar interests have the same product preferences [13–15]. Among these, the user collaborative filtering algorithm integrating projects mainly solves the problem of information overload by collaboratively filtering attributes [16]. Moreover, the application of customized relational networks and tags solves the problem of data sparsity in the matrix factorization recommendation model [17, 18], and the collaborative filtering recommendation method integrates contextual perception, project similarity, and user behavior, giving recommendation results from the perspectives of patients' contexts, projects, and user participation [19–21]. In addition, scholars have also conducted modeling research on doctor recommendation, disease diagnosis, and medical tests [22, 23] from the perspectives of semantic characteristics of medical resources [24], user data types [25], user ratings and comment portraits [26], and Bayesian algorithms [27].

The recommendation algorithms in the traditional medical field mainly have the following three problems. First, in terms of department recommendation, the algorithm based on the expert system causes problems such as the explosion of knowledge rule reasoning and the high maintenance cost of the knowledge base. Furthermore, the algorithm based on similarity may not effectively recognize synonyms, which may decrease recommendation accuracy. Second, in terms of doctor recommendation, the user-based collaborative filtering algorithm may fail because patients with similar symptoms are not necessarily diagnosed with the same disease, owing to the complexity and diversity of diseases. Moreover, because patients' etiologies are not necessarily related, the assumption of the project-based collaborative filtering algorithm that users would choose doctors in the same research field as their previous doctors can hardly be met. Third, although the relevant literature has studied how to reduce data sparsity [28–30], the collaborative filtering recommendation algorithm still cannot completely avoid the performance problems caused by data sparsity.

Based on the above analysis, it can be concluded that the existing recommendation algorithms cannot fully meet the requirements of recommendation in the Internet medical field. Without leaving home, patients can access medical services provided by doctors in online health communities all over the country, including disease diagnosis, medical treatment guidance, and medication guidance. Meanwhile, patients consulting online come from far and near and may need offline medical treatment, making it necessary to take patients' location into account. Therefore, this paper proposes an online prediagnosis doctor recommendation model that integrates ontology characteristics and disease text mining, improving both the effectiveness of doctor recommendation in the online medical service environment and the convenience of offline medical treatment for patients.

2. Research on the Doctor Recommendation Model

The doctor recommendation model is divided into three main steps. Step 1: data preprocessing. Perform word segmentation and stop word removal on the patient's natural language input. Step 2: hospital department recommendation. After screening patients' query data, create the most similar sentence set based on the word vectors of key parts or on the similarity measurement for symptom descriptions, so as to achieve department recommendation. Step 3: doctor recommendation. Use SQL queries in the MySQL database to complete doctor recommendation (Figure 1).

Figure 1: Prediagnosis physician recommendation model integrating ontology characteristics and disease text mining.

3. Data Cleaning Process

Two main kinds of data are available online. The first is patients' online consultations about disease symptoms, which mainly cover age, gender, symptom description, and other information. The second is doctors' online information, including doctors' names, titles, hospitals, departments, and specialties, as shown in Table 1. All data are in structured form, and fields such as disease description, prediagnosis, and specialties are stored as text. The model is then built after word segmentation and keyword extraction (Figure 2).


Table 1

Information sample on patients and doctors online.

Patient ID	Gender	Age	Province/city	Chief complaint	Initial consultation department online
8070844	Female	65	Jiangsu	Menstruation keeps coming. B-ultrasound result shows that my endometrium is thick. I took progesterone and had curettage. I have now been taking medicine for 10 days. Three days after the progesterone, I still had a large amount of blood flow, and my stomach ached. I am wondering what is wrong with me.	Gynecology
81305510	Female	42	Guangdong	Bilateral hydrosalpinx. I have no history of abortion. I want to be pregnant now, what should I do now?	Gynecology
12031251	Female	43	Heilongjiang	43-year-old, irregular menstruation for many years, 3 times in 2 months, the period lasted 7 to 8 days, the amount is small, and the color is dark brown. What medicine should I take?	Gynecology
57715499	Female	37	Henan	Just had a miscarriage a month ago; however, I got pregnant during confinement. Can I keep the child?	Gynecology
72520784	Female	53	Shanghai	My mother is 53 years old. She feels nervous, unable to breathe, cannot lie down, and feels no strength.	Neurology

Doctor name	Title	Hospital	City	Specialties	Department
Niu**	Chief Physician	Ningbo First Hospital	Ningbo	Diagnosis and treatment of diabetes and thyroid disease	Endocrinology
Yang*	Associate Chief Physician	Shijiazhuang First Hospital	Shijiazhuang	Hemorrhagic cerebrovascular diseases such as cerebral aneurysm, arteriovenous malformation, arteriovenous fistula, and cavernous hemangioma; ischemic cerebrovascular diseases such as carotid artery stenosis, vertebral artery stenosis, intracranial artery stenosis, and moyamoya disease	Neurosurgery
Xu**	Chief Physician	Beijing Anzhen Hospital, Capital Medical University	Beijing	Diagnosis, surgical treatment, and perioperative treatment of various congenital heart diseases	Pediatric cardiac surgery
Wang**	Associate Chief Physician	Shenzhen Bao'an People's Hospital	Shenzhen	Diagnosis and treatment of diabetes and its complications, hyperthyroidism, and hypothyroidism; use of insulin pumps and dynamic blood glucose monitors	Endocrinology
Liu**	Chief Physician	Hospital of Traditional Chinese Medicine in Uygur, Xinjiang	Xinjiang	Neurology of traditional Chinese medicine	Neurology

4. Information on Ontology Characteristics of Doctors and Patients

The doctor and patient demographic data obtained from the WeiYi platform are mostly well-organized, semistructured textual data. The first step is to transform the unstructured text into structured text through named entity recognition and information extraction. Organization names, people's names, and location names can be recognized by applying several open source Chinese language processing tools [31], such as FudanNLP developed by Fudan University [32], the NLPIR word segmentation system developed by the Chinese Academy of Sciences [33], and the LTP Chinese natural language processing platform of Harbin Institute of Technology [34]. In addition, records with missing values and duplicated records are deleted. Where different doctors share the same name, fields such as "the hospital to which they belong" and "the department to which they belong" are used to distinguish them.

5. Data on Patients' Condition Description

Data on patients' online condition descriptions are specific statements expressed by patients in natural language. In their initial form, the data suffer from contents that are nonstandardized, repetitive, short, and single [35]. The authors marked the text content by part of speech and synonyms and then used a human tissue lexicon and a human anatomy lexicon to match the word segmentation results, so as to extract disease symptoms and keywords for human body parts. As shown in Table 1, one patient's chief complaint was that "it was caused by pelvic effusion 8 years ago, there was no abortion history and no pregnancy." Common clinical symptoms that appear in the description but that the patient does not actually have make keyword extraction difficult. For example, "no abortion history" was divided into "no" and "abortion history," resulting in the extraction of "abortion history" as a keyword; however, the patient did not have this symptom. To deal with such situations, before word segmentation, the authors divide the description paragraph into short sentences or phrases at punctuation marks, and stop words are retained during word segmentation. Then, during keyword extraction, target words are not treated as real keywords if they carry negative modifiers such as none, unaccompanied, and no.
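The negation handling described above can be sketched as follows. This is a minimal illustration in English, not the authors' implementation: the tiny lexicon, the list of negation cues, and the three-token look-back window are all assumptions made for the example, and the original work operates on Chinese text with dedicated segmentation tools.

```python
import re

# Hypothetical toy lexicon and negation cues (assumptions for illustration only).
SYMPTOM_LEXICON = {"pelvic effusion", "abortion history", "pregnancy", "stomachache"}
NEGATION_CUES = ("no", "none", "not", "never", "without", "unaccompanied")


def split_clauses(text):
    """Split a description into short lowercase clauses at punctuation marks."""
    return [c.strip().lower() for c in re.split(r"[,.;!?]", text) if c.strip()]


def extract_keywords(text, lexicon=SYMPTOM_LEXICON, window=3):
    """Return lexicon terms found in the text that are not preceded by a negation cue."""
    keywords = []
    for clause in split_clauses(text):
        tokens = clause.split()
        for term in sorted(lexicon):
            term_tokens = term.split()
            n = len(term_tokens)
            for start in range(len(tokens) - n + 1):
                if tokens[start:start + n] == term_tokens:
                    # Look back a few tokens for a negative modifier.
                    preceding = tokens[max(0, start - window):start]
                    if not any(tok in NEGATION_CUES for tok in preceding):
                        keywords.append(term)
                    break
    return keywords


complaint = ("It was caused by pelvic effusion 8 years ago, "
             "there was no abortion history and no pregnancy.")
print(extract_keywords(complaint))
```

A fixed look-back window is a crude stand-in for real negation scope detection; it is used here only because it keeps the example short.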

6. Data on Doctors' Specialties

Information on doctors' specialties is structured textual data and suffers from synonymous naming and missing data. Synonymous naming refers to the problem that doctors in different hospitals use different names for their fields of expertise. Specifically, synonyms for the field of expertise include specialties, being good at, specializing in, being skilled in, being professional with, medical interest, and research direction. All synonymous names are merged into the same field. For missing data, multiple data sources are integrated to complete the missing values, or the affected records are deleted.
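A small sketch of this cleaning step is given below, assuming doctor records are held as Python dictionaries. The function names `normalize_record` and `fill_missing` and the canonical field name `specialties` are hypothetical choices for the example.

```python
# Field labels listed in the text, all mapped to one canonical field (assumed name: "specialties").
SPECIALTY_ALIASES = {
    "specialties", "being good at", "specializing in", "being skilled in",
    "being professional with", "medical interest", "research direction",
}


def normalize_record(record):
    """Return a copy of a doctor record with all specialty aliases merged into 'specialties'."""
    normalized = {}
    merged = []
    for key, value in record.items():
        if key.strip().lower() in SPECIALTY_ALIASES:
            if value and value not in merged:
                merged.append(value)
        else:
            normalized[key] = value
    normalized["specialties"] = "; ".join(merged) if merged else None
    return normalized


def fill_missing(primary, secondary):
    """Fill fields that are missing (None or empty) in primary from a second data source."""
    return {k: (primary.get(k) or secondary.get(k)) for k in set(primary) | set(secondary)}


source_a = normalize_record({"name": "Niu**", "hospital": "Ningbo First Hospital"})
source_b = normalize_record({"name": "Niu**", "Being good at": "diabetes and thyroid disease"})
print(fill_missing(source_a, source_b))
```

Records that still lack a specialty after merging could then be dropped, which corresponds to the deletion option mentioned above.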

7. Doctor Recommendation

7.1. Department Recommendation

For questions input by patients, the keywords of each sentence are obtained after word segmentation and stop word removal. Next, the corresponding question set is obtained by locating the question sentences associated with each keyword. The authors divided the question set into a sample dataset and a test dataset, both containing patients' condition description text, the online prediagnosis department recommendation, and so on. Then, the word2vec library is used to train a word vector model on the keywords of the sentences in the sample dataset, the similarity between the questions input by patients in the test dataset and the word vector model of the sample dataset is calculated, and finally the questions in the test dataset that are most similar to the sample dataset are selected. Following the rule that higher similarity indicates the same department, after the similarity results are screened one by one, the department with the highest similarity is the final recommendation result.

8. Doctor Recommendation

The core significance of developing online medical and health services is to reshape the medical service process and optimize the allocation of medical resources, so as to meet the medical and health needs of individual consumers. Owing to their mobility, convenience, speed, personalization, and interactivity, online medical services have become the main channel through which consumers seek medical help online and have been widely adopted. To some extent, this alleviates medical pressure and improves the allocation of medical resources. Patients using online medical services come from all regions, and the majority of them have common and chronic diseases, so it is sometimes necessary for patients to confirm their diagnosis offline. Therefore, doctor recommendation that takes patients' location into account is particularly important for improving the convenience of offline medical treatment and for attracting more patients to use online medical services. Using SQL queries in the MySQL database, this paper matches keywords against doctors' specialties, department, and region information, integrates patients' location information, and recommends local doctors who meet the requirements according to the patient's region. For instance, a patient named Zhang San, living in Zhejiang province, with a condition described as thick endometrium, heavy menstrual flow, and stomachache, would be recommended to see a Chief Physician surnamed Wang from the Department of Gynecology at Zheyi Hospital.
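The query step can be sketched as follows. The paper uses MySQL; to keep the example self-contained, Python's built-in `sqlite3` module is used here as a stand-in, and the table schema, column names, and toy rows are assumptions made for illustration rather than the authors' actual database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of doctors (hypothetical schema and toy rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doctors (name TEXT, title TEXT, hospital TEXT, "
        "region TEXT, department TEXT, specialties TEXT)"
    )
    conn.executemany(
        "INSERT INTO doctors VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("Wang", "Chief Physician", "Zheyi Hospital", "Zhejiang", "Gynecology",
             "thick endometrium, heavy menstrual flow"),
            ("Li", "Attending Physician", "Hypothetical Hospital", "Jiangsu", "Gynecology",
             "irregular menstruation"),
            ("Niu", "Chief Physician", "Ningbo First Hospital", "Zhejiang", "Endocrinology",
             "diabetes and thyroid disease"),
        ],
    )
    return conn


def recommend_doctors(conn, department, region, keywords):
    """Return doctors in the department and region whose specialties mention any keyword."""
    if not keywords:
        return []
    like_clauses = " OR ".join("specialties LIKE ?" for _ in keywords)
    sql = (
        "SELECT name, title, hospital FROM doctors "
        "WHERE department = ? AND region = ? AND (" + like_clauses + ") "
        "ORDER BY name"
    )
    # Parameter binding keeps patient-supplied keywords out of the SQL text.
    params = [department, region] + ["%" + k + "%" for k in keywords]
    return conn.execute(sql, params).fetchall()


conn = build_demo_db()
print(recommend_doctors(conn, "Gynecology", "Zhejiang",
                        ["thick endometrium", "stomachache"]))
```

The same `WHERE ... LIKE` pattern carries over to MySQL, although the placeholder syntax depends on the client library used.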

9. Sentence Similarity

9.1. Calculation of Similarity Based on Post Content

After obtaining the unique d-dimensional distributed vector representation of the disease description text, the similarity and distance between any two texts can be obtained through similarity calculation. The authors use the cosine formula to measure the similarity between two texts and the Mahalanobis distance to measure the distance between the natural language descriptions of the two posts. Assume that the paragraph vectors of two natural language descriptions are expressed as $PV_a = (x_{11}, x_{12}, \cdots, x_{1d})$ and $PV_b = (x_{21}, x_{22}, \cdots, x_{2d})$, where d is the dimension of the paragraph vectors. The similarity and distance are defined as follows:

$\begin{matrix} sim(PV_a, PV_b) = \frac{PV_a \cdot PV_b}{\|PV_a\| \, \|PV_b\|} = \frac{\sum_{i=1}^{d} x_{1i} x_{2i}}{\sqrt{\sum_{i=1}^{d} x_{1i}^{2}} \sqrt{\sum_{i=1}^{d} x_{2i}^{2}}}, \\ dis(PV_a, PV_b) = \sqrt{(PV_a - PV_b)^{T} S^{-1} (PV_a - PV_b)}, \end{matrix}$

(1)

where $S$ is the covariance matrix of the feature vectors $PV_a$ and $PV_b$.
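As a worked illustration of equation (1), the two measures can be computed in plain Python as below. The function names are hypothetical, and the inverse covariance matrix is passed in directly as an assumption of the sketch, since estimating $S$ requires a collection of paragraph vectors that is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mahalanobis_distance(a, b, s_inv):
    """sqrt((a - b)^T S^{-1} (a - b)), with S^{-1} given as a nested list."""
    diff = [x - y for x, y in zip(a, b)]
    # Multiply S^{-1} by the difference vector, then take the inner product with it.
    s_inv_diff = [sum(row[j] * diff[j] for j in range(len(diff))) for row in s_inv]
    return math.sqrt(sum(d * v for d, v in zip(diff, s_inv_diff)))


identity = [[1.0, 0.0], [0.0, 1.0]]
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
print(mahalanobis_distance([3.0, 4.0], [0.0, 0.0], identity))
```

With the identity matrix as $S^{-1}$, the Mahalanobis distance reduces to the ordinary Euclidean distance, which makes the example easy to check by hand.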

9.2. TF-IDF Sentence Similarity Based on Co-Occurring Words

This method assumes that the more vocabulary two sentences share, the higher their similarity [36]. Specifically,

$\begin{matrix} SimScore(S_1, S_2) = \frac{|S_1 \cap S_2|}{|S_1 \cup S_2|} \sum_{w_i \in S_1 \cap S_2} weight(w_i), \\ weight(w_i) = \frac{Num(w_i, k)}{N_k} \times \log\left(\frac{N_t}{Num(w_i, t) + 1}\right). \end{matrix}$

(2)

Here, $|\cdot|$ is the cardinality of a set, $S_1$ and $S_2$ are the word sets of the two sentences to be compared, $w_i$ is symptom word $i$ in the department's question-and-answer sentences, $weight(w_i)$ is the TF-IDF [37] weight, $Num(w_i, k)$ is the number of sentences in the question-and-answer sentence set of department $k$ in which the symptom word $w_i$ appears, $N_k$ is the number of all questions and answers in department $k$, $N_t$ is the total number of questions and answers in the knowledge base, and $Num(w_i, t)$ is the number of sentences in the knowledge base in which the symptom word $w_i$ appears. The TF-IDF sentence similarity calculation method based on co-occurring words is a surface structure analysis method. It uses only the surface information of a sentence, that is, the word frequency, part of speech, and other information about the words in the sentence, to calculate sentence similarity, without considering synonyms. This reduces the accuracy of the sentence similarity.
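Equation (2) can be illustrated with a short sketch in which each sentence is represented as a set of words. The function names and the toy corpus are assumptions for the example; `dept_sentences` stands for the sentence set of department $k$ and `all_sentences` for the whole knowledge base, and both are assumed to be nonempty.

```python
import math


def tfidf_weight(word, dept_sentences, all_sentences):
    """weight(w) = Num(w, k) / N_k * log(N_t / (Num(w, t) + 1))."""
    num_k = sum(1 for s in dept_sentences if word in s)
    num_t = sum(1 for s in all_sentences if word in s)
    return (num_k / len(dept_sentences)) * math.log(len(all_sentences) / (num_t + 1))


def sim_score(s1, s2, dept_sentences, all_sentences):
    """Jaccard overlap of the two word sets times the summed weights of the shared words."""
    s1, s2 = set(s1), set(s2)
    shared = s1 & s2
    union = s1 | s2
    if not union:
        return 0.0
    jaccard = len(shared) / len(union)
    return jaccard * sum(tfidf_weight(w, dept_sentences, all_sentences) for w in shared)


# Toy corpus (hypothetical): two neurology sentences plus four from other departments.
neurology = [{"headache", "nausea"}, {"headache", "dizziness"}]
knowledge_base = neurology + [{"rash", "itching"}, {"thirst"}, {"rash"}, {"cough"}]
print(sim_score({"headache", "nausea"}, {"headache", "dizziness"}, neurology, knowledge_base))
```

Because only identical words count as shared, two sentences that express the same symptom with different words score zero, which is the synonym limitation noted above.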

9.3. Sentence Similarity Method Based on Word Vector

The word vector sentence similarity method mainly uses the deep learning tool word2vec [38] to convert words into vectors and obtains the semantic similarity of the sentence pair to be compared by calculating the similarity between the vectors. The specific formula is as follows:

$\begin{matrix} \underset{w_i \in I, w_j \in R}{CosSim}(w_i, w_j) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}} \times \sqrt{\sum_{i=1}^{n} y_i^{2}}}, \\ SimScore(S_1, S_2) = \frac{\sum_{w \in IR} \beta_w \, MaxSimValue(CosSim(w, IR))}{\sum_{w \in IR} \beta_w}. \end{matrix}$

(3)

Here, $IR = S_1 \cup S_2$; $w_i$ and $w_j$ are the two words to be compared, taken from sentence $S_1$ and sentence $S_2$, respectively; $n$ is the dimension of the word vectors; $x_i$ and $y_i$ are the values of the $i$th dimension of the word vectors of $w_i$ and $w_j$, respectively; $MaxSimValue(CosSim(w, \cdot))$ is the maximum cosine similarity between the word vector of word $w$ and the word vectors of all words in the other sentence; and the parameter $\beta_w$ is the TF-IDF weight of word $w$ in the sentence. The greater the value of $SimScore(S_1, S_2)$, the more similar the two sentences are and the closer their semantics.
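The weighted maximum-similarity score of equation (3) can be sketched as below. The two-dimensional vectors are hand-made toy values standing in for trained word2vec embeddings, and using a default weight of 1.0 when a word has no TF-IDF weight is an assumption of this sketch.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two word vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sim_score(s1, s2, vectors, beta):
    """Weighted average, over all words in both sentences, of each word's best match in the other sentence."""
    s1, s2 = set(s1), set(s2)
    if not s1 or not s2:
        return 0.0
    numerator = denominator = 0.0
    for w in s1 | s2:
        # Compare each word with the words of the sentence it is matched against.
        other = s2 if w in s1 else s1
        best = max(cos_sim(vectors[w], vectors[o]) for o in other)
        weight = beta.get(w, 1.0)  # assumed default when no TF-IDF weight is known
        numerator += weight * best
        denominator += weight
    return numerator / denominator if denominator else 0.0


# Toy vectors (hypothetical values, not trained embeddings).
toy_vectors = {
    "dizziness": [1.0, 0.0],
    "light-headed": [0.9, 0.1],
    "rash": [0.0, 1.0],
}
print(sim_score(["dizziness"], ["light-headed"], toy_vectors, {}))
print(sim_score(["dizziness"], ["rash"], toy_vectors, {}))
```

Unlike the co-occurrence score, near-synonyms such as "dizziness" and "light-headed" receive a high score here because their vectors point in similar directions.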

10. Experiment

10.1. The Data Set

To analyze the doctor recommendation method proposed in this paper, an experimental study was conducted. Data from the five most common departments were crawled from WeiYi, a well-known domestic online medical platform. The department names followed the standardized department names used in real hospitals (e.g., endocrinology, dermatology, gynecology, pediatrics, and neurology). As a result, a dataset T consisting of 20000 patients' online prediagnosis records was built. To compare the algorithms experimentally, two widely used evaluation indexes of recommendation performance were adopted in this paper: accuracy rate (P) and recall rate (R).
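The exact definitions of P and R used in the experiment are not reproduced in this text. As an assumption, the sketch below uses the standard per-department definitions: P is the share of recommendations of a department that are correct, and R is the share of that department's true cases that were recommended to it. The function name and the toy labels are hypothetical.

```python
def precision_recall(predicted, actual, department):
    """Per-department precision and recall; predicted entries may be None (no recommendation)."""
    true_positive = sum(
        1 for p, a in zip(predicted, actual) if p == department and a == department
    )
    recommended = sum(1 for p in predicted if p == department)
    relevant = sum(1 for a in actual if a == department)
    precision = true_positive / recommended if recommended else 0.0
    recall = true_positive / relevant if relevant else 0.0
    return precision, recall


predicted = ["Neurology", "Neurology", "Gynecology", None]
actual = ["Neurology", "Pediatrics", "Gynecology", "Neurology"]
print(precision_recall(predicted, actual, "Neurology"))
```

Averaging the per-department values would give a single pair of figures per algorithm; whether the paper averages this way is not stated here.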

10.2. Parameter Setting

In the experiment, the dimension of the word vectors was set to 100. For the keyword set similarity results used for department recommendation, the five questions with the highest sentence similarity were taken as the recommended result data (topN = top 5), and the threshold for keyword set similarity was set to 0.8; that is, when the keyword similarity between the keywords and the test set data was calculated, the result had to exceed 0.8 to be included in the hospital department recommendation set. If two or more hospital departments were recommended, the case was treated as a special case with no recommendation.
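One reading of this decision rule is sketched below: keep the five most similar questions, discard those at or below the 0.8 threshold, and recommend a department only if exactly one department remains. The function name and the input format, a list of (similarity, department) pairs, are assumptions for the example.

```python
def recommend_department(scored_questions, top_n=5, threshold=0.8):
    """Return the single department among the top-N questions above the threshold, else None."""
    top = sorted(scored_questions, key=lambda pair: pair[0], reverse=True)[:top_n]
    departments = {dept for sim, dept in top if sim > threshold}
    if len(departments) == 1:
        return next(iter(departments))
    # Zero departments, or two or more: treated as no recommendation.
    return None


scores = [(0.95, "Gynecology"), (0.90, "Gynecology"), (0.50, "Neurology")]
print(recommend_department(scores))
```

Returning `None` for ambiguous cases trades some coverage for fewer wrong recommendations, which matches the special case described above.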

11. Results and Analysis

Among the 20000 patients surveyed, 15460 were female (77.3%). This may be because women are often expected to take care of family health and other responsibilities in addition to work; women also tend to pay more attention to health information than men. A total of 16800 of the 20000 patients (84.0%) were 30 to 45 years of age. Older adults have limited experience in consulting physicians and obtaining medicines online, and children cannot master online consultation skills, so these groups may rarely consult physicians on the Internet themselves or may ask family members to make online inquiries for them. In the 20000 records, 12600 of the physicians (63.0%) were chief physicians or associate chief physicians, while 19400 hospitals (97.0%) were ranked 3A (see Table 2).

To verify the feasibility and effectiveness of the proposed department and doctor recommendation algorithms, the experiment compared them with the content-based recommendation algorithm and the user-based collaborative filtering algorithm. First, 100 records were randomly extracted from the dataset T by hospital department name, and word vector training was performed. After word segmentation and stop word removal on the data of the different departments, the keyword set was obtained, and the word vector model was trained on this keyword set (see Table 3). The word vector model consisted of patients' real consultation questions, and words outside those questions were treated as noise words, that is, meaningless words unrelated to the patient's consultation. Three different algorithms were used to measure keyword similarity for hospital department recommendation (see the results of the three algorithms in Table 4).

Table 2

Summary of the characteristics of the collected data records (N = 20000).

Characteristic	Value, n (%)
Gender
Male	4540 (22.7)
Female	15460 (77.3)
Age (years)
25-30	1586 (7.9)
31-45	16800 (84.0)
46-50	1014 (5.1)
>55	600 (3.0)
Physician's professional title
Resident physician	2670 (13.35)
Attending physician	4330 (21.65)
Associate chief physician	8040 (40.2)
Chief physician	4560 (22.8)
Other	400 (2.0)
Hospital's ranking level
3A	19400 (97.0)
Other	600 (3.0)

Table 3

Word vector model and keyword examples.

Word vector-based model	Keyword set	Department
Word vector-based model	Headache, nausea, right eye, swelling, stuffy nose, right ear, tinnitus, etc.	Neurology
Keyword set	1. Migraines, nausea, loss of appetite 2. Headache, dizziness, protrusion of left eye, congestion of eyeball 3. Head amplification, stuffiness, dizziness, palpitation, and restlessness 4. Palpitations and palpitations 10. Weak right hand, unable to clench a fist, palpitation, unable to breathe

Table 4

Comparison of accuracy and recall rate.

Algorithm method	Accuracy rate (%)	Recall rate (%)
Word vector-based	74	78
Content-based	63	67
Co-occurring word-based	54	56

As Table 4 shows, the similarity recommendation method proposed in this paper, which incorporates ontology features and disease text mining, performed best when applied to consultations about selecting the appropriate hospital department: its accuracy rate and recall rate (74% and 78%) were higher than those of the other two algorithms. This is because the word vector sentence similarity measurement strategy can better measure the semantic similarity of sentences. Consider, for example, the sentence pair "I went to the hospital to see the dentist and went home, light-headed, heavy head, runny nose" and "When I came back from the dentist, I started to experience dizziness with symptoms of heavy head and runny nose". If the measurement method based on co-occurring words is used, the similarity value is low, because the sentence pair contains synonym pairs such as (light-headed, dizziness), (heavy head, sinking head), and (runny nose, clear nose) rather than identical words. The content-based method performs relatively well, and the word vector method performs best, indicating that it can more accurately capture the underlying semantics of the sentence. On the one hand, this is because the method in this paper measures the similarity of keywords better. For instance, the keywords "headache, palpitation, indisposition" and the keywords "head amplification and restlessness" were considered similar. The results were better than those of the sentence similarity measurement based on co-occurring words. On the other hand, the proposed method takes full account of factors such as the location of doctors and patients, as well as doctors' fields of expertise, which is not the case for the content-based recommendation method, which takes only the patient's disease information into account.

As Figures 3 and 4 show, the recommendation performance of the word vector method varied across hospital departments. The recommendation accuracy for the pediatric department was below 0.5, and that for the neurology, endocrinology, gynecology, and dermatology departments was above 0.5 in every case; among these, the recommendation accuracy for gynecology improved the most. What the four departments with relatively higher recommendation accuracy (neurology, endocrinology, gynecology, and dermatology) had in common was that the characteristics of their consultation questions were very typical and obvious. For instance, high blood sugar, sudden weight loss, and thirst are typical of endocrinology; red rash, round rash, redness, swelling, and itching are typical of dermatology; pregnancy and irregular menses are typical of gynecology. However, the situation is different for the pediatric department: if the consultation does not include information indicating age, such as baby, child, or 6 months old, the system may recommend other departments, which reduces the accuracy accordingly.

Figure 3: Recommendation accuracy comparison of different departments.

Figure 4: Comparison of recommendation rates of various departments.

Finally, the SQL query function of the MySQL database is used to integrate the patient's regional factors. According to the patient's region, we use department and regional keyword matching to recommend doctors in hospitals in that region who meet the patient's needs. For example, for "Zhang San, from Zhejiang, the condition is described as thickened endometrium, heavy menstrual flow, and stomachache," the recommended doctor is "Zhejiang First Hospital-Gynecology-Dr. Wang (Chief Physician)." The process is shown in Figure 5.

Figure 5: Doctor recommendation framework.
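The region and department matching step can be sketched as a parameterized SQL query. This is only an illustration under stated assumptions: the paper uses MySQL, whereas the sketch uses Python's built-in sqlite3 module as a stand-in so that it runs without a database server; the table name, column names, and the function recommend_doctors are hypothetical; and all rows other than the Zhejiang gynecology example quoted above are invented sample data.

```python
import sqlite3


def recommend_doctors(conn, region, department):
    """Return doctors whose region and department both match the patient."""
    query = (
        "SELECT hospital, department, doctor, title "
        "FROM doctors "
        "WHERE region = ? AND department = ? "
        "ORDER BY doctor"
    )
    rows = conn.execute(query, (region, department)).fetchall()
    return [f"{hospital}-{dept}-{doctor} ({title})"
            for hospital, dept, doctor, title in rows]


# In-memory database with a hypothetical schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doctors ("
    "region TEXT, hospital TEXT, department TEXT, doctor TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO doctors VALUES (?, ?, ?, ?, ?)",
    [
        ("Zhejiang", "Zhejiang First Hospital", "Gynecology",
         "Dr. Wang", "Chief Physician"),
        # The two rows below are invented to show that non-matching
        # departments and regions are filtered out.
        ("Zhejiang", "Zhejiang First Hospital", "Dermatology",
         "Dr. Li", "Attending Physician"),
        ("Beijing", "Beijing Example Hospital", "Gynecology",
         "Dr. Zhao", "Chief Physician"),
    ],
)

# The department is assumed to come from the similarity step;
# the region comes from the patient's profile.
recommendations = recommend_doctors(conn, "Zhejiang", "Gynecology")
print(recommendations)
```

Passing the region and department as query parameters, rather than concatenating them into the SQL string, keeps the lookup safe against malformed input. The same pattern applies to a MySQL client, although the placeholder syntax differs between drivers.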

12. Conclusion

Traditional manual medical guidance is increasingly unable to meet people's medical needs: registration is difficult, and the problem of not finding the right clinic has become increasingly prominent. Given the shortcomings of traditional department recommendation methods, the need for professional diagnostic expertise, and the information asymmetry between doctors and patients, it is impossible for many patients to identify the appropriate clinic or doctor. Once mistakes are made, online consultation time is wasted, and costs increase for both hospitals and patients when the patient instead goes offline for medical treatment. In this paper, the proposed online prediagnosis doctor recommendation model integrates ontology characteristics and disease text mining. The experiments use real data from a comprehensive Internet medical consultation platform and compare the proposed sentence similarity method with the content-based and collocate-based methods; the results verify the reliability and effectiveness of the method in this paper. The model gives relatively more accurate recommendations based on ontology characteristics such as patients' description texts and doctors' specialties. This provides great convenience for patients seeking medical treatment and at the same time reduces medical costs. As a result, the proposed model improves patients' online consultation experience and offline treatment convenience, enriching the value of online prediagnosis data.

13. Limitations

This paper is not without limitations. First, this study was carried out based on data from only one online medical community, which calls its generalizability into question. Future studies may consider collecting data from multiple online medical community platforms to verify the recommendation effect of the proposed algorithm. Second, because this study focuses solely on a recommendation model for Chinese patients, similar studies should be carried out in a Western context in the future. Third, because of the complexity of medical domain knowledge, follow-up research should not only incorporate techniques such as semantic analysis and sentiment analysis to expand the sample into general practice data but also consider introducing users' other behavioral data, so that user information behavior factors help optimize the target object. For intelligent department recommendation tasks, in addition to controlling data quality, deep learning algorithms such as LSTM should be applied to improve model accuracy in the future. The intelligent department recommendation task can also be abstracted as a multilabel classification task for texts. Accordingly, multiple department categories can be recommended for patients' questions covering multiple departments, further improving the accuracy of the proposed recommendation model, with the expectation of applying it to more online medical consultation platforms.

Algorithm 1: This module preprocesses the sample dataset. The aim is to segment words, remove stop words, and retain the key parts or key symptoms of patients' online condition descriptions.
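The preprocessing steps described here (segmentation, stop word removal, and retention of key symptoms) can be sketched as follows. This is not the authors' listing. It is a minimal English-language stand-in: the original data are Chinese and would need a dedicated word segmenter, which the regular expression below only approximates, and both the stop word list and the symptom lexicon are small hypothetical examples.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOP_WORDS = {"i", "have", "had", "a", "an", "the", "and", "for",
              "of", "my", "is", "with", "days"}
SYMPTOM_LEXICON = {"rash", "itching", "redness", "swelling", "headache"}


def segment(text):
    """Split text into lowercase word tokens (stand-in for a segmenter)."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that carry no diagnostic information."""
    return [t for t in tokens if t not in STOP_WORDS]


def keep_key_symptoms(tokens):
    """Retain only tokens found in the symptom lexicon."""
    return [t for t in tokens if t in SYMPTOM_LEXICON]


def preprocess(text):
    """Run the three steps in order and return the key symptom tokens."""
    return keep_key_symptoms(remove_stop_words(segment(text)))


description = "I have had a red rash with itching and swelling for three days."
tokens = preprocess(description)
print(tokens)
```

Keeping the three steps as separate functions makes it straightforward to swap the regular expression for a real segmenter or to replace the lexicon with one derived from the training data.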

Algorithm 2: This module uses the word2vec library to train the word vector model for dermatology on sample data such as "dermatology.xls."

Algorithm 3: This module has two main goals. First, it preprocesses the test data, including word segmentation and stop word removal, retaining the key parts or symptoms of the disease description. Second, it compares the word vectors of the test data with those from the training results, and the departments with high similarity are recommended to patients.
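The second goal, comparing the word vectors of a test description with the training results and recommending the most similar departments, can be sketched as a ranking by cosine similarity. The sketch rests on several assumptions that are not in the original: each department is represented by the average vector of a few keywords, the three-dimensional vectors are invented rather than produced by a trained word2vec model, and the names rank_departments and DEPARTMENT_KEYWORDS are hypothetical.

```python
import math

# Invented toy vectors; a trained model would supply these in practice.
TOY_VECTORS = {
    "rash": [0.9, 0.1, 0.0],
    "itching": [0.8, 0.2, 0.1],
    "thirst": [0.1, 0.9, 0.0],
    "weight-loss": [0.0, 0.8, 0.2],
    "pregnancy": [0.0, 0.1, 0.9],
    "menstruation": [0.1, 0.0, 0.9],
}

# Hypothetical keyword profiles standing in for per-department training data.
DEPARTMENT_KEYWORDS = {
    "dermatology": ["rash", "itching"],
    "endocrinology": ["thirst", "weight-loss"],
    "gynecology": ["pregnancy", "menstruation"],
}


def average(tokens):
    """Average the known word vectors; return None if no token is known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_departments(query_tokens, top_k=1):
    """Return the top_k (department, similarity) pairs for a query."""
    query_vec = average(query_tokens)
    if query_vec is None:
        return []
    scored = [(dept, cosine(query_vec, average(words)))
              for dept, words in DEPARTMENT_KEYWORDS.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


ranking = rank_departments(["itching", "rash"], top_k=3)
for department, score in ranking:
    print(f"{department}: {score:.3f}")
```

Returning the full ranking rather than a single label leaves room for recommending more than one department when a description spans several specialties.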

Acknowledgments

This project was funded by grants from the National Natural Science Foundation of China: Research on Consumer Credit Value Measurement Integrating Online Social Relationships in eCommerce (71571162). The data were collected with help from the administrator of the WeiYi platform.

Data Availability

The data were collected with help from the administrator of the WeiYi platform. Due to third-party rights, patient privacy, and commercial confidentiality, the data are not open source.

Ethical Approval

The data in this paper are divided into two parts. One part is the information crawled from the platform, such as patient comments and doctor profiles. This kind of data is open to the public, and anyone can use computer technology to obtain it from the platform. The other part is the patient's age, gender, geographical location, and other information provided by WeiYi (microdoctor). The WeiYi platform is one of the hundreds of online medical platforms in China, with tens of thousands of registered hospitals and registered doctors and hundreds of thousands of patients using the platform. The platform itself has a sound risk control system, and we have also signed a confidentiality agreement with the platform to define the scope of data use.

Disclosure

The paper was published in a reduced version at the IEEE 6th International Conference on Big Data Analysis (ICBDA) in 2021.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors' Contributions

SZ and CJ refined the topics and methods at the initial stage of paper writing. Then, SZ conducted the statistical analysis and wrote the paper under the guidance of CJ. Both authors reviewed, revised, and approved the final draft.

References

1. Balarajan Y., Selvaraj S., Subramanian Southward. Health care and equity in India[J]Health care and disinterestedness in India. The Lancet. 2011;377, article 9764:505–515. doi: 10.1016/S0140-6736(10)61894-6. [PMC gratuitous article] [PubMed] [CrossRef] [Google Scholar]

ii. Goh J. Thou., Gao Thou., Agarwal R. The creation of social value: Can can an online health community reduce rural-urban health disparities? MIS Quarterly. 2016;40(one):247–263. doi: 10.25300/misq/2016/forty.i.11. [CrossRef] [Google Scholar]

3. Pan J., Shallcross D. Geographic distribution of infirmary beds throughout Prc: a county-level econometric analysis. International Journal for Equity in Health. 2016;fifteen(i):p. 179. doi: 10.1186/s12939-016-0467-9. [PMC costless article] [PubMed] [CrossRef] [Google Scholar]

5. Bo H. Pattern and realization of AISCP guiding system congenital in knowlege base. SUZHOU:SoochowUniversity; 2006. [CrossRef] [Google Scholar]

6. Ru H. In: The blueprint and implementation of the guidance system based on the reasoning algorithm. FEI H. E., editor. Anhui University; 2016. [CrossRef] [Google Scholar]

8. Ju C., Zhang S. Research on md recommendation model for Pre-Diagnosis online based on Large data Mining. 2021 IEEE sixth International Conference on Big Information Analysis (ICBDA 2021); 2021. [Google Scholar]

ix. Chuan-Peng C., Zhi-Gang W. A method of sentence similarity computing based on Hownet. Computer Engineering and Science. 2012;34(2):172–175. doi: x.3969/j.issn.1007-130X.2012.02.031. [CrossRef] [Google Scholar]

10. Yifeng Ten., Lijun L., Qingsong H., Tiewei F. Inquiry on TF-IDF weight improvement algorithm in intelligent guidance system. Computer Engineering and Applications. 2017;53(4):238–243. doi: 10.3778/j.issn.1002-8331.1506-0258. [CrossRef] [Google Scholar]

11. Hai-Ling X., Xiao W., Xiao-Dong Fifty., Yan B.-P. Comparison study of internet recommendation arrangement. Periodical of Software. 2009;xx(ii):350–362. doi: 10.3724/SP.J.1001.2009.03388. [CrossRef] [Google Scholar]

12. Huang C.-G., Yin J., Wang J., Liu Y.-B., Wang J.-H. Uncertain Neighbors'Collaborative filtering recommendation algorithm. Chinese Journal of Computers. 2010;33(8):1369–1377. doi: 10.3724/SP.J.1016.2010.01369. [CrossRef] [Google Scholar]

13. Liang Z., Na Z. Improved collaborative filtering algorithm. Computer Systems & Applications. 2016;25(vii):147–150. doi: 10.15888/j.cnki.csa.005224. [CrossRef] [Google Scholar]

14. Mingming J. Incorporate Topic Model into Collaborative Filtering. Beijing: Beijing Insititute of Applied science; 2016. [Google Scholar]

15. Wu Y., Rui T., Ling Fifty. News recommendation method by fusion of content-based recommendation and collaborative filtering. Journal of Computer Applications. 2016;36(2):414–418. doi: 10.11772/j.issn.1001-9081.2016.02.0414. [CrossRef] [Google Scholar]

16. López-Nores Thousand., Blanco-Fernández Y., Pazos-Arias J. J., Gil-Solla A. Holding-based collaborative filtering for wellness-aware recommender systems. Expert Systems with Applications. 2012;39(viii):7451–7457. doi: 10.1016/j.eswa.2012.01.112. [CrossRef] [Google Scholar]

17. Surong Y., Xiaoqing F., Yixing 50. Matrix factorization based social recommender model. Journal of Tsinghua University(Science and Technology) 2016;56(vii):793–800. doi: ten.16511/j.cnki.qhdxxb.2016.21.045. [CrossRef] [Google Scholar]

eighteen. Bing F., Xiaoting N. Tag-based matrix factorization recommendation algorithm. Application Research of Computers. 2017;34(4):1021–1025. doi: 10.3969/j.issn.1001-3695.2017.04.015. [CrossRef] [Google Scholar]

19. Huang Z. X., Lu 10. D., Duan H. L., Zhao C. Collaboration-based medical knowledge recommendation. Artificial Intelligence in Medicine. 2012;55(i):xiii–24. doi: 10.1016/j.artmed.2011.ten.002. [PubMed] [CrossRef] [Google Scholar]

20. Kim J. H., Lee D. S., Chung Yard. Y. Item recommendation based on context-enlightened model for personalized u-healthcare service. Multimedia Tools and Applications. 2014;71(2):855–872. doi: 10.1007/s11042-011-0920-0. [CrossRef] [Google Scholar]

21. Deshpande G., Karypis Yard. Item-based top-Due north recom- mendation algorithms. ACM Transactions on Informa- tion Systems. 2014;22(1):143–177. [Google Scholar]

23. Hu B. Southward., Feng D., Cao W. C., LQ F., JH Grand. Mobile intelligent affliction diagnosis system based on Bayesian assay. Journal of Figurer Applications. 2008;28(6):fifteen–17. [Google Scholar]

24. Shoukun X., Weiwei W. Balance recommendation algorthm for medical resources based on semantic. Computer Engineering. 2015;41(nine):74–79. doi: 10.3969/j.issn.1000-3428.2015.09.013. [CrossRef] [Google Scholar]

25. Yan Z., Shiyao L., Can Z. An improved recommendation algorithm for mobile health care system. Journal of University of Chinese Academy of Sciences. 2016;34(1):112–118. doi: 10.7523/j.issn.2095-6134.2017.01.015. [CrossRef] [Google Scholar]

26. Jiang M. G., Song D. G., Liao Fifty. J., Zhu F. A Bayesian rec-ommender model for user rating and review profiling. Tsnghua Science and Technology. 2015;20(6):634–643. doi: 10.1109/TST.2015.7350016. [CrossRef] [Google Scholar]

28. Xiang-Wu One thousand., Shu-Dong 50., Yu-Jie Z., Xun H. Research on social recommender systems. Journal of Software. 2015;26(6):1356–1372. doi: ten.13328/j.cnki.jos.004831. [CrossRef] [Google Scholar]

thirty. Yang West., Yong Z., Zhendong L., Guanci Y. Rating prediction algorithm based on semantic similarity and matrix factorization. Journal of Computer Applications. 2017;37(Supplement 1):287–291. [Google Scholar]

31. Xiaoyu F., Yongxiang D., Pengwei Z., Xiao Z. Report for the structure method of scientist profile with multi source information fusion. Library and Information Service. 2018;62(fifteen):31–40. doi: x.13266/j.issn.0252-3116.2018.15.004. [CrossRef] [Google Scholar]

32. Qiu Ten., Zhang Q., Huang X. Fudan NLP:a toolkit for Chinese natural language processing. Proceedings of the coming together of the Association for Computational Linguistics: organization demonstrations; 2013; Sofia: the Association for Computational Linguistics. pp. 49–54. [Google Scholar]

33. Zhou L., Zhang D. NLPIR: A theoretical framework for applying natural language processing to information retrieval. Journal of the American Society for Information science and Technology. 2003;54(ii):115–123. doi: 10.1002/asi.10193. [CrossRef] [Google Scholar]

34. Ting L., Wanxiang C., Zhenghua Fifty. Linguistic communication technology platform. Periodical of Chinese Data Processing. 2011;25(half dozen):53–62. doi: 10.3969/j.issn.1003-0077.2011.06.008. [CrossRef] [Google Scholar]

35. Wang K. User information extraction and assay big data environment. Beijing: Beijing University of Posts and Telecommunications; 2018. [Google Scholar]

Articles from BioMed Research International are provided here courtesy of Hindawi Limited

Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8379386/