OceanGPT
A Large Language Model for Ocean Science Tasks.
The Potential of OceanGPT
1
How was OceanGPT trained?
Data quality is crucial for training domain large language models. To train OceanGPT, we collected an ocean science corpus that spans multiple fields. Since each subfield and topic has its unique data characteristics and patterns, we proposed a domain-specific instruction generation framework called DoInstruct. This framework employs a multi-agent collaborative approach to generate instruction data. We trained OceanGPT based on open-source models (such as Qwen, LLaMA, MiniCPM, etc.) and instructions generated by the DoInstruct framework.
2
What can OceanGPT be used for?
OceanGPT can act as a domain expert in ocean science tasks, such as generating effective solutions for key issues like ocean pollution protection. It also shows potential in underwater embodied intelligence tasks. During the fine-tuning process of OceanGPT, we integrated symbolic robotic simulation control commands and code. We validated its ability to control underwater robots (such as trajectory planning) in a simulator.
3
How to download and use OceanGPT?
We have released Oceangpt (2B, 7B, 14B) at HuggingFace, ModelScope and WiseModel: