Data Scientist Engineer

Salary range: 60,000 ~ 80,000 TWD / month

Company: 萬達人工智慧科技股份有限公司 (Wonders.ai)

Job Description
1. Clean, augment, and generate language data; develop and optimize data pipelines that support multilingual preprocessing and augmentation, covering Chinese, English (given priority), and Japanese.
2. Familiarity with popular databases, web scraping techniques, data search, and version control, with experience using Git for data management and tracking.
3. Proficiency in regular expressions (regex), Python language processing libraries (e.g., NLTK, cutlet, or m-pinyin), and multilingual text processing tools, with expertise in cleaning, transforming, and analyzing language-specific data.
4. Skill in data augmentation and format conversion, with the ability to design and implement large-scale language data pipelines, including efficient storage and processing in Parquet format.
5. Solid Prompt Engineering experience: building effective prompts that support the operation and optimization of large language models (LLMs), including prompt format design and refinement.
6. Strong data analysis skills and familiarity with statistical methods and the characteristics of language data; able to generate deep insights with automated tools and to analyze and evaluate crawled multilingual data.
7. Knowledge of data versioning and data quality control; able to track the change history of data and ensure transparency and traceability throughout the data processing pipeline.
8. Familiarity with basic machine learning workflows and with preparing datasets to support LLM pretraining (not fine-tuning), plus a basic understanding of building vector similarity search and semantic data matching.
9. Practical experience in multilingual data crawling and cleaning; able to design efficient scraping pipelines that collect and organize distributed data sources.
+++++ We value proactive team members who suggest solutions and improvements! Join us in tackling multilingual data challenges! +++++

Tools
•UNIX
•Git
•ASP.NET
•C++
•Python
•Visual Studio .NET
•Prompt
•Excel
•Data Architect
•Data Marts
•Hive
•SAS
•SPSS

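Job Skills
•Software programming
•Modular system design
•Machine Learning
•Database system administration and maintenance
•Database programming
•Database software applications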

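Other Requirements
1. English and Chinese proficiency will be considered; Japanese is a plus.
2. The company uses A100 and H100 GPUs for large-scale training and data processing.
3. We develop personalized AI solutions for emotional companionship, including companionship games, interactive characters, and nurturing-style games. Candidates with a passion for anime, characters, and the IP industry are preferred.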

We collaborate with large gaming corporations and anime IPs to develop personalized AI solutions, including companionship games, interactive characters, and nurturing-style games. Candidates with a passion for anime, characters, and the IP industry will be an integral part of our team. Our work focuses on integrating AI into the anime, manga, and gaming industries, creating unforgettable experiences for global users.


**Applicants must take a company test**

Company address:

Area D, 10F, No. 109, Sec. 3, Minsheng East Road (民生東路三段109號10樓D區)

Other:

Wonders.ai is a pioneer in AI companionship. We are seeking engineers with AI and machine learning skills who are passionate about shaping the future of human-AI interaction. Our mission is to seamlessly integrate AI companions into everyday life and a wide range of industries, including mobile games, customer retail kiosks, and enterprise applications extended through XR media in fields such as tourism, healthcare, design, content creation, manufacturing, retail, and education. We are not just creating technology; we are building deep connections. Our AI companions are designed with human-like personalities, responsiveness, initiative, and knowledge, which sets them apart from traditional AI.

Wonders.ai serves elite clients in Taiwan, China, Japan, and the United States, including top-20 Asian companies in gaming, retail, manufacturing, and healthcare. We are equipped with cutting-edge resources, including A100 and H100 GPUs, and supported by a team of 30 professionals. We are looking for colleagues who share our vision and see AI as a companion rather than merely a tool, to join us in turning AI into a cross-media partner.

Wonders.ai is a leader in AI companionship and won the Best AI Companionship Product award at ChinaJoy 2024. Our market spans animation, comics, and games across regions including the United States, Japan, Taiwan, and China. Working with global partners, Wonders.ai is building the next global AI companionship application. The company is preparing for a global product launch at the end of this year and is expanding its team.

We are looking for individuals who are passionate about innovation and willing to fully dedicate themselves to realizing a future of real-time interaction between 3D characters and humans. This is an excellent opportunity to unleash your creativity and receive substantial rewards. If you are eager to be part of this high-potential, challenging endeavor, please visit our website [www.wonders.ai/news] to learn more about our achievements and case studies.
1. (https://www.forbes.com/sites/rodberger/2023/02/17/new-wonderverse-pushing-boundaries-of-reality/?sh=5d4cab5c7bc4)
2. (https://gritdaily.com/wonders-ai-revolutionizing-ar-vr/)

2025-01-14