The Forefront of Medical and Drug Discovery Data Scientist Training #AJ branch-off edition

Acaric Journal

“AJ branch-off edition” is a category of articles that were previously published in “Acaric Journal”, a career magazine for graduate students and researchers published by Acaric Co., Ltd., or that are new articles available only on the web. This time, we bring you articles from vol.1.

With artificial intelligence (AI) development gaining momentum worldwide, there is a need to train data scientists who can handle vast amounts of data in fields such as medicine and drug discovery. In this article, we interviewed Dr. Katsuyuki Takeuchi and Dr. Takeshi Hase of Tokyo Medical and Dental University on the topic of “The Forefront of Medical and Drug Discovery Data Scientist Training.”


Prof. Katsuyuki Takeuchi

Prof. Katsuyuki Takeuchi is a Professor at Tokyo Medical and Dental University and an Invited Professor at the Center for Mathematical and Computing Sciences, Osaka University. He is currently engaged in data science education in the medical and drug discovery fields, as well as entrepreneurship education and industry-academia collaboration education. He is also involved in career development of young researchers such as graduate students and postdoctoral fellows.

Dr. Takeshi Hase

Dr. Takeshi Hase is a Specially Appointed Associate Professor in the Department of Innovation Human Resource Development, Integrated Education Organization, Tokyo Medical and Dental University. He is also a Visiting Associate Professor at the Faculty of Pharmacy, Keio University; a Senior Scientist at SBX Corporation; a Visiting Researcher at the Institute for Systems Biology, a non-profit organization; and a Part-time Researcher in the Research Department of the NPO Institute of Next Generation Medicine. As a bioinformatician, he is involved in research on drug discovery using AI and big data.

– The demand for data science personnel is increasing in various fields, but is it also increasing in the medical and drug discovery fields?

Prof. Takeuchi: There is a growing demand for data science professionals in both academia (universities and public institutions) and industry. However, it is not easy to secure human resources who are well versed in data science. For example, there are cases where researchers in the field of data science who have been active in academia are now moving to industry. On the other hand, in the industrial world, there seems to be a flow of human resources from major domestic companies to venture companies and foreign companies, with “non-standardized” salary systems. Also, I have heard from pharmaceutical companies that it is becoming difficult to recruit graduate students with expertise in data science.

Dr. Hase: There are also examples of university faculty members and postdoctoral fellows who conducted research in academia and then moved to large corporations and venture companies to work in areas such as AI development. It is not only postdoctoral fellows and non-tenured faculty members who are moving from academia to industry, but also project leaders from national research institutes.

Prof. Takeuchi: The data science workforce as a whole is becoming more and more fluid. In particular, I feel that the movement from academia to industry is becoming more pronounced. What is the motivation behind this trend?

Dr. Hase: I think it is because they can do research in a more favorable environment than they would have by staying in academia. For example, if they move to a private company, they don’t need to worry about tenure. Some companies have ample funding for research. It is also possible to earn a better salary than in academia.

Prof. Takeuchi: This is why there is a trend of data science specialists flowing from academia to industry. In some graduate schools of engineering such as informatics, many of the graduate students who have completed the master’s course have gone to work in the private sector, and as a result, the number of students in the doctoral course has fallen below capacity.

– In the context of medicine and drug discovery, what kind of efforts are being made to develop human resources in data science?

Prof. Takeuchi: For example, in the new research field of AI drug discovery, researchers are scattered among graduate schools of information technology, engineering, medicine, pharmacy, and so on. However, we have yet to establish majors or courses, not only in AI drug discovery but also in data science for the medical field more broadly. Shiga University and Yokohama City University, for their part, have established faculties for data science education, although they do not specialize in medicine and drug discovery.

 Meanwhile, the Medical and Drug Discovery Data Science Consortium (MD-DSC), for which Tokyo Medical and Dental University serves as the representative institution, is working to develop data science specialists who will be active in the medical, drug discovery, and healthcare fields, although it is a human resource development activity rather than a graduate school major. The MD-DSC, part of the Data-Related Human Resource Development Program (MEXT), is a consortium of universities, companies, and public institutions that provides a variety of curricula. In FY2020, 70 students are enrolled in the “Doctoral Talent Course” for graduate students and postdocs, and 30 are enrolled in the “Corporate Talent Course” for company employees. The number of people who wish to take the courses is increasing every year, and we feel that there is a growing need in society.

[Reference] Medical and Drug Discovery Data Science Consortium

– I understand that MD-DSC has two different courses, the “Doctoral Talent Course” and the “Corporate Talent Course”, but are the contents of both courses the same?

Prof. Takeuchi: The lecture and practical subjects are common to both the Doctoral Talent Course and the Corporate Talent Course, but the completion requirements differ. For example, one of the requirements for completing the Doctoral Talent Course is a corporate internship. In the Corporate Talent Course, by contrast, participants can choose training conducted by universities or public institutions. Some of these training programs deal with real-world data and provide valuable experience. Since last year, we have opened up some of the training programs to the Doctoral Talent Course as well.

– In recent years, there has been a trend among Japanese pharmaceutical companies to strengthen “in silico drug discovery”, but is the “molecular science & information technology” approach also active in education and research in academia?

Dr. Hase: At universities, most of the work is done on a laboratory basis, but some laboratories are trying to create a pipeline for drug discovery AI in partnership with companies.

 In this context, the Japan Agency for Medical Research and Development (AMED) has allocated a large budget for the “Development of Next-Generation Drug Discovery AI through Industry-Academia Collaboration” as one of its projects to improve the efficiency of drug discovery support*. The aim of this project is to develop a comprehensive AI drug discovery platform through which the participating companies can share their data. The field can be said to be active in the sense that AI drug discovery efforts previously carried out by individual laboratories are now connected under the AMED project and can proceed in an organized manner.

*Development of Next Generation Drug Discovery AI through Industry-Academia Collaboration (DAIIA)

– Is there a clear difference between a “data scientist specializing in medical data” and a “bioinformatician”? How can we become these?

Dr. Hase: “Data scientist specializing in medical data” and “bioinformatician” are similar in terms of the methods used (statistical analysis, machine learning), but the direction in which they are used is different. For example, a data scientist who specializes in medical data uses statistical analysis and machine learning on data from electronic medical records and patient data, while a bioinformatician analyzes omics data, such as gene expression data and metabolite data.

 As for how to become a data scientist or bioinformatician, there is no single standard training path. Even those who have studied statistical analysis and machine learning in a Master’s program will need to learn different domain knowledge depending on the lab they work in. It is also possible to change the type of data you work with over the course of your research.

Translated with (free version) and Acaric Journal Editorial Board

  • AI drug discovery : drug discovery using AI technology
  • in silico : computer-aided methods
  • Omics data : data that comprehensively measures information about the molecules of a living organism.
  • domain knowledge : knowledge specific to a certain field of expertise or industry or sector