2026 Internship - Institute of Statistical Science, Academia Sinica

2026 研究主題清單 (Research List)

Updated 2026.03.02 (持續更新中 / updated continuously)

主持人 (PI)	研究主題 (Research Topic)	研究介紹 (Introduction)	參考網頁 (Reference Page)
潘建興 Frederick Kin Hing Phoa	1. 現代實驗設計和分析 Design and Analysis of Modern Experiments 2. 大型網路資料分析 Analysis of Large-Scale Network Data 3. 最佳化程序方法和其應用 Optimization: Methods and Applications 4. 地震、環境和交通大數據分析 Big Data Analytics in Seismicity, Environments and Transportation 5. 數位孿生在人工智慧與智慧製造的應用 Applications of Digital Twins in Artificial Intelligence and Smart Manufacturing	介紹：我的研究小組旨在為有興趣應用統計和數據分析技術來解決尖端現實問題的學生提供第一手經驗的暑期實習機會。下列包括一些適合學生在兩個月內展開並獲得足夠研究成果的潛在主題。 1.現代實驗設計與分析： (a) 設計臨床試驗的實驗。(b) 設計網路調查或廣告實驗。(c) 設計人工智慧和機器人測試實驗。(d) 分析生物醫學診斷的實驗數據。(e) 分析體育運動中測試實驗的數據。 2.大規模網路數據分析： (a) 研究真實網路的網路演化機制。(b) 在大規模網路中有效地偵測社群。(c) 從世界上最大的科學資料庫分析引文和科研合作的行為。(d) 從大型語言模型中開發自動主題識別和內容摘要。(e) 在大規模網路中定位關鍵節點。 3.最佳化程序方法和其應用： (a) 開發新的混合最佳化技術。(b) 優化物流業配送路線。(c) 優化超級電腦設施的任務調度。(d) 開發高效率的翻譯器來解碼加密訊息。(e) 在給定的平面圖中有效地分佈監視系統。 4.地震、環境和交通大數據分析： (a) 透過網路時間序列聚類分析空氣污染(PM 2.5)空氣箱大數據。(b) 透過網路時間序列聚類分析地下水變化大數據。(c) 透過分散式聲學感測資料分析台灣地震。(d) 最佳化包含可能人為延誤的火車/公車時刻表。(e) 優化共享單車配送路線和站點分配。 5.數位孿生在人工智慧與智慧製造的應用： (a) 發展半監督大數據的最佳子取樣技術。(b) 為智慧工廠設計監測系統。(c) 量化數位孿生的不確定性和變化。本次實習機會開放給各層級的學生（學士、碩士、博士），工作內容依參與學生的程度進行調整。在這兩個月內，學生將與我進行文獻討論，然後在我的指導下開始研究主題。成果優良的學生會收到繼續參與研究的邀請，並以產生足以在國際期刊上發表的成果為最終目標。我們鼓勵所有有興趣參加本研究小組暑期實習的學生以電子郵件方式(fredphoa@stat.sinica.edu.tw)向我獲取進一步的說明和詢問。 Introduction: The summer internship in my research group aims to provide first-hand experience to students who are interested in applying statistical and data analytics techniques to solve cutting-edge real-world problems. The following list includes some potential topics suitable for students to work on and obtain adequate research results within two months. 1. Design and Analysis of Modern Experiments: (a) designing experiments for clinical trials. (b) designing experiments for online surveys or advertising. (c) designing experiments for AI and robotics testing. (d) analyzing experimental data from biomedical diagnosis. (e) analyzing experimental test data from sports. 2. Analysis of Large-Scale Network Data: (a) studying the network evolution mechanism for real networks. 
(b) detecting communities efficiently in large-scale networks. (c) analyzing citation and collaboration behaviors from the world's largest scientific database. (d) developing automatic topic identification and content summarization from large language models. (e) detecting essential nodes in large-scale networks. 3. Optimization: Methods and Applications: (a) developing new hybrid optimization techniques. (b) optimizing delivery routes for the logistics industry. (c) optimizing task scheduling in supercomputer facilities. (d) developing an efficient translator for decoding encrypted messages. (e) deploying surveillance systems efficiently in a given floor plan. 4. Big Data Analytics in Seismicity, Environments and Transportation: (a) analyzing air pollution (PM 2.5) big airbox data by network time series clustering. (b) analyzing groundwater variation big data by network time series clustering. (c) analyzing earthquakes in Taiwan by Distributed Acoustic Sensing (DAS) data. (d) optimizing train/bus schedules with potential human delays. (e) optimizing shared-bike delivery routes and station allocations. 5. Applications of Digital Twins in Artificial Intelligence and Smart Manufacturing: (a) developing optimal subsampling techniques for semi-supervised big data. (b) designing monitoring systems for smart factories. (c) quantifying the uncertainty and variation in digital twins. This internship is open to students at all levels (bachelor's, master's, doctoral), and the work is adjusted to the level of the participating students. Within these two months, students are expected to have literature discussions with me and then start working on the research topic under my supervision. Students with excellent results will receive an invitation to continue participating in research, with the ultimate goal of producing sufficient results to be published in an international journal. 
All students who are interested in participating in my research group's summer internship are encouraged to email me (fredphoa@stat.sinica.edu.tw) for further clarifications and enquiries.	研究人員網頁 (PI's Page): https://staff.sinica.edu.tw/fredphoa/index.html Email: fredphoa@stat.sinica.edu.tw
張明中 Ming-Chung Chang	多階層因子設計的理論探討 A Theoretical Exploration of Multi-Stratum Factorial Designs	多階層因子設計廣泛應用於處理實驗單位中的複雜異質性。在這個暑期實習計畫中，學生將首先學習多階層因子設計的統一理論。接下來，他們將探索此類設計的理論性質，包括擴展現有理論以開發能適應實驗單位中一般異質性結構的更全面框架。此外，學生還將深入了解這些設計的各種應用，涵蓋從傳統實驗設計到最前沿的人工智慧應用。 [SPEC研究亮點]: https://spec.ntu.edu.tw/research/research-detail286 Multi-stratum factorial designs are widely used to address complex heterogeneity in experimental units. In this summer internship program, students will first be introduced to a unified theory of multi-stratum designs. Following this, they will explore the theoretical properties of such designs, including extending existing theories to develop more comprehensive frameworks that accommodate general heterogeneity structures in experimental units. Additionally, students will gain insight into various applications of these designs, ranging from traditional experimental setups to cutting-edge AI-driven applications.	研究人員網頁 (PI's Page): https://sites.google.com/view/mcchang/ Email: mcchang0131@as.edu.tw
陳定立 Ting-Li Chen	棒球資料分析 Baseball Data Analytics	本研究主題聚焦於棒球運動中的資料分析與統計方法，主要以美國職棒大聯盟（MLB）之公開資料為研究基礎，進行打擊結果、防守表現、球場效果等相關問題之統計分析與建模。研究內容涵蓋資料整理、探索性資料分析、統計模型建構與結果詮釋，讓學生實際接觸真實且具規模的運動資料，並理解資料背後所反映的結構性特徵。此外，研究團隊亦與國內職業棒球球團洽談資料合作，未來將進一步分析球團內部資料，讓學生了解統計分析在實務決策中的應用。本研究計畫歡迎對棒球運動有高度興趣，具備基礎數學與統計背景，並能進行程式撰寫與資料處理（如 R 或 Python）之在學學生申請；在條件相近的情況下，將以大學生為優先考量。錄取之學生將被要求於研習開始前，先行熟悉相關棒球資料與研究主題，以利研習期間之研究進行。另歡迎對本研究主題有高度興趣之同學，可於申請前先行來信與主持人聯繫，進一步了解研究內容與方向是否合適；未能錄取本次暑期實習但仍有興趣者，亦歡迎洽談以兼任研究助理方式參與研究。實際錄取仍依本所系統媒合結果為準。 This research topic focuses on data analytics and statistical methods in baseball. The primary data source consists of publicly available Major League Baseball (MLB) datasets, which will be used to study problems such as batting outcomes, defensive performance, and ballpark effects through statistical analysis and modeling. The project covers data preprocessing, exploratory data analysis, statistical model construction, and interpretation of results, allowing students to work hands-on with real-world sports data and to understand the structural features underlying observed outcomes. In addition, the research team is in discussion with a domestic professional baseball organization regarding potential data collaboration. Future work will involve the analysis of team-internal data, providing students with exposure to the application of statistical analysis in real-world decision-making contexts. Applicants who have a strong interest in baseball, a basic background in mathematics and statistics, and the ability to perform data processing and programming (e.g., using R or Python) are encouraged to apply. When qualifications are comparable, undergraduate students will be given priority. Students admitted to the program will be expected to begin familiarizing themselves with baseball data and related research topics prior to the start of the internship. 
Prospective applicants with strong interest in this topic are also welcome to contact the principal investigator before applying to discuss research directions and mutual fit. Those not selected for the summer internship but still interested in participating in the research are welcome to inquire about opportunities to work as part-time research assistants. Final admission decisions will be made through the institute’s official matching system.	研究人員網頁 (PI's Page): https://staff.stat.sinica.edu.tw/tlchen/ Email: tlchen@stat.sinica.edu.tw
陳璿宇 Hsuan-Yu Chen	以 AI 與資料科學推進癌症治療的實證研究 Evidence-Based Research Advancing Cancer Therapy through AI and Data Science	癌症治療正快速進入以資料與人工智慧驅動的新階段。如何將多體學大數據與 AI 模型，真正轉化為可驗證、可應用於臨床的新型治療策略，仍是當前生醫研究的核心挑戰。中央研究院統計科學研究所陳璿宇教授（Dr. Hsuan-Yu Chen）研究團隊，長期致力於轉譯數據科學、癌症多體學與 AI 驅動藥物開發。本暑期研究計畫將提供學生實際參與跨領域研究的機會，從資料分析到生醫實驗，深入理解 AI 如何實際影響癌症治療決策與新藥研發流程。 實驗室主持人：陳璿宇博士（Dr. Hsuan-Yu Chen）；中央研究院統計科學研究所研究員；國立臺灣大學／中興大學／高雄醫學大學合聘教授；台灣精準健康暨毒理基因體學會理事長；美國與台灣 Cancer Moonshot（癌症登月計畫）相關國際研究團隊。 研究專長：轉譯數據科學、癌症多體學、AI 驅動精準醫療。 研究核心方向： 一、資料科學與精準醫療：本團隊整合基因體、轉錄體與蛋白質體等高維度多體學資料，結合臨床大數據，建立可解釋、可應用的疾病風險預測與治療反應模型，相關研究成果已實際推進至臨床研究與轉譯應用。 二、生成式 AI 與新藥設計：研究團隊發展以 Generative AI 為核心的分子設計方法，建構可驗證的「分子設計智慧代理平台」，並結合細胞實驗、動物模型與質譜分析，形成完整的「預測–驗證–回饋（Dry–Wet Loop）」研究架構，目標為建立可擴充的 AI 藥物設計流程，挑戰傳統方法難以處理的癌症標靶。 研究環境與團隊特色： 跨領域研究團隊：成員涵蓋臨床醫師、統計學者、資工／電機工程師與生物學研究人員。 研究量能與指導制度完善：博士生 7 人、博士後研究員 3 人、碩士級研究人員 7 人；研究生與實習成員可在明確分工與導師制度下，實際參與核心研究工作。 適合對象：具備統計、資訊工程、電機、生物醫學、生物、醫學相關背景，並對 AI、資料科學與癌症治療或藥物開發之交叉研究具有高度學習動機者。如果你希望理解 AI 不只是模型，而是如何實際參與醫學決策與治療創新，歡迎加入我們的研究團隊。 Cancer therapy is rapidly entering a new era driven by data and artificial intelligence. A central challenge in contemporary biomedical research is how to translate large-scale multi-omics data and AI models into verifiable and clinically actionable therapeutic strategies, rather than remaining at the level of computational predictions alone. The research group led by Dr. Hsuan-Yu Chen at the Institute of Statistical Science, Academia Sinica, focuses on translational data science, cancer multi-omics, and AI-enabled drug discovery. This summer research program offers students the opportunity to participate directly in interdisciplinary research, spanning data analysis and biomedical experimentation, to gain a practical understanding of how AI can meaningfully influence cancer treatment decisions and drug development pipelines. 
Principal Investigator: Hsuan-Yu Chen, PhD; Research Fellow, Institute of Statistical Science, Academia Sinica; Joint Professor, National Taiwan University / National Chung Hsing University / Kaohsiung Medical University; President, Taiwan Society for Precision Health and Toxicogenomics; Member of internationally recognized research teams associated with the U.S./Taiwan Cancer Moonshot Project. Research Expertise: Translational Data Science, Cancer Multi-omics, AI-driven Precision Medicine. Core Research Areas: 1. Data Science and Precision Medicine: Our team integrates high-dimensional multi-omics data, including genomics, transcriptomics, and proteomics, with large-scale clinical datasets to develop interpretable and clinically applicable models for disease risk prediction and treatment response assessment. These efforts have progressed beyond computational studies and have been translated into ongoing clinical and translational research applications. 2. Generative AI and Drug Design: We develop Generative AI–based molecular design methodologies and build a verifiable “Molecular Design Intelligent Agent Platform.” By integrating cell-based assays, animal models, and mass spectrometry, we establish a complete Prediction–Validation–Feedback (Dry–Wet Loop) research framework. The long-term goal is to create a scalable AI-driven drug design pipeline capable of addressing cancer targets that are difficult to approach using conventional methods. 
Research Environment and Team Strengths: Interdisciplinary team composition: clinicians, statisticians, computer science and electrical engineering researchers, and experimental biologists. Strong research capacity and mentorship structure: 7 PhD students, 3 postdoctoral researchers, and 7 master-level research staff. Students and interns participate in core research activities within a clearly structured and well-supported mentoring system. Ideal Candidates: Applicants with backgrounds in statistics, computer science, electrical engineering, biomedical sciences, biology, or medicine, who are strongly motivated to work at the intersection of AI, data science, cancer therapy, and drug development. If you are interested in understanding how AI functions not merely as a model, but as a practical component of medical decision-making and therapeutic innovation, we welcome you to join our research team.	研究人員網頁 (PI's Page): https://staff.stat.sinica.edu.tw/hychen/ Email: hychen@stat.sinica.edu.tw
陳君厚、楊欣洲 Chun-houh Chen & Hsin-Chou Yang	健康大數據在精準醫療與智慧健康的應用 Analysis and Application of Big Health Data in Precision Medicine and Smart Health	參與智慧健康研究團隊（陳君厚博士和楊欣洲博士聯合指導），學習統計學習、機器學習、深度學習、資料視覺化的方法與技巧，實際分析醫學影像、基因資料、環境暴露、就醫記錄、人口統計等大數據，開發資料科學與人工智慧的自動化方法，進行特徵擷取、疾病分型和診斷、病變分割與偵測、風險評估和預測、大型語言模型應用等。 During this summer internship, under the joint supervision of Drs. Chun-houh Chen and Hsin-Chou Yang, you will join our research team, "Smart Health," to learn methods and techniques in statistical learning, machine learning, deep learning, and data visualization. You will apply these methods to the practical analysis of big health data, including medical images, genetic data, environmental exposures, medical records, and demographic statistics. Additionally, you will gain knowledge in developing automated approaches in data science and artificial intelligence, with a focus on feature extraction, disease classification and diagnosis, lesion segmentation and detection, risk assessment and prediction, as well as applications of large language models. 要求：(1) 對統計資料科學、人工智慧、精準醫療、智慧健康、基因體統計、醫學影像、資料視覺化、電腦視覺處理、生醫訊號處理、大語言模型等充滿熱情。(2) 喜歡實際資料分析與研究。(3) 研究動機強，希望接受嚴格訓練。 Requirements: (1) Passion for statistical data science, artificial intelligence, precision medicine, smart health, statistical genomics, medical imaging, data visualization, computer vision, biomedical signal processing, and large language models; (2) Interest in real data analysis and research; (3) Highly motivated and eager to undergo rigorous training.	研究人員網頁 (PI's Page): 1. 陳君厚研究室網頁 (Chun-houh Chen) https://gap.stat.sinica.edu.tw/ Email: cchen@stat.sinica.edu.tw 2. 楊欣洲研究室網頁 (Hsin-Chou Yang) https://staff.stat.sinica.edu.tw/hsinchou/ Email: hsinchou@stat.sinica.edu.tw 3. 研究群網頁 (Research group website): https://sites.stat.sinica.edu.tw/SH/
黃信誠 Hsin-Cheng Huang	臺灣天氣預報偏差修正與降尺度之統計及AI方法 Statistical Machine Learning for Weather Forecast Bias Correction and Downscaling	台灣地形複雜，全球數值天氣預報模式往往難以精確捕捉細尺度的降雨空間分佈與極端強度，且模式輸出常存在系統性偏差。本研究計畫旨在招募對統計科學與機器學習有興趣的同學，利用歷史觀測與預報資料，發展並改進先進的統計後處理技術。實習生將有機會接觸並應用「類比後處理法（Analog Postprocessing, AP）」與「機率配對法（Probability Matching, PM）」來修正預報位置誤差並還原真實降雨強度。除了傳統統計方法，我們也將探索更前瞻的研究方向，例如：(1) 透過局部線性回歸修正氣候變遷或極端事件下因找不到相近類比而產生的偏差；(2) 利用生成式 AI 框架建立從數值模式到觀測場的傳輸映射，生成具備空間相關性且能準確處理「降雨零值（intermittency）」的高解析度場域。參與本計畫的學生能親身體驗如何將統計理論應用於實際的氣象服務，提升台灣中長期天氣預報的經濟價值與決策效益。 Taiwan’s complex topography makes it challenging for numerical weather models to accurately capture fine-scale spatial distributions and extreme intensities of precipitation, often leading to systematic biases. This research project invites students with a strong interest in statistical science and machine learning to participate in the development and improvement of advanced statistical postprocessing techniques. Interns will have the opportunity to apply and refine methods such as Analog Postprocessing (AP) and Probability Matching (PM) to correct spatial errors and downscale low-resolution forecasts to a high-resolution grid. Beyond traditional methods, the project investigates cutting-edge frontiers, including: (1) regression adjustment, which uses local-linear KNN estimators to correct biases for rare extremes or climate-change scenarios where historical analogs are sparse; and (2) Generative AI frameworks to generate high-resolution fields that respect spatial dependencies and accurately handle precipitation intermittency (zero-rainfall values). This internship offers an opportunity to apply statistical theory to real-world meteorological services, enhancing the economic and decision-making value of weather forecasts in Taiwan.	研究人員網頁 (PI's Page): https://sites.stat.sinica.edu.tw/hchuang/ Email: hchuang@stat.sinica.edu.tw
顏佐榕 Tso-Jung Yen	基於科學資料的深度學習；用於科學研究的代理型人工智慧 Deep Learning from Scientific Data; Agentic AI for Scientific Research	目前我們正致力於發展：1) 一套自動化的可解釋人工智慧方法，並將其應用於醫學影像分類。同時，我們也運用生成式模型來解決 2.1) 結構化矩陣生成問題，以及 2.2) 函數對函數 (function-on-function) 的預測問題。除了上述研究主題外，我們亦持續探索人工智慧代理 (AI agents) 在 3.1) 科學研究上的應用，以及 3.2) 社會互動模擬上的應用。 We are currently developing 1) an automated approach to explainable AI, with applications to medical image classification. We are also applying generative modelling to solve 2.1) structured matrix generation problems and 2.2) function-on-function prediction problems. Beyond these topics, we are exploring the use of AI agents for 3.1) scientific research and 3.2) social interaction simulation.	研究人員網頁 (PI's Page): https://sites.stat.sinica.edu.tw/tjyen/ Email: tjyen@stat.sinica.edu.tw
高振宏 Chen-Hung Kao	統計遺傳學研究 Statistical Genetics	研發和應用統計方法分析遺傳資料以回答並解決與遺傳有關問題的學術研究。 Develop and apply statistical methods to analyze genetic data to answer and solve genetic questions.	研究人員網頁 (PI's Page): https://staff.stat.sinica.edu.tw/chkao/ Email: chkao@stat.sinica.edu.tw