The famous o3 "GeoGuessr" prompt did not work

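OpenAI's o3 model drew attention for its remarkable performance on the GeoGuessr task of identifying where a photo was taken, so the author tested whether the detailed instructions dubbed the "magic prompt" actually had any effect. In a benchmark of 200 images, the elaborate prompt scored lower on average accuracy than a simple standard prompt, which points to the danger of assuming that "the prompt is what produced the results." When a model is already capable enough, a complex prompt adds nothing and only creates an illusion of improvement.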
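A comparison like the one summarized above can be sketched as a paired evaluation: both prompts are scored on the same images, and the per-image differences are resampled to see how uncertain the mean gap is. This is only an illustrative sketch, not the author's benchmark code. The function name `paired_bootstrap_diff`, the "higher is better" per-image score, and the sample numbers are all assumptions made for illustration; the sample numbers are placeholders, not results from the article.

```python
import random
from statistics import mean


def paired_bootstrap_diff(scores_a, scores_b, n_resamples=2000, seed=0):
    """Return (mean of a - b, approximate 95% bootstrap interval).

    scores_a and scores_b are per-image scores for two prompts,
    listed in the same image order (hypothetical metric, higher is better).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both prompts must be scored on the same images")

    # Work with per-image differences so image difficulty cancels out.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)
    n = len(diffs)

    # Resample images with replacement and record each resampled mean gap.
    resampled = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lo = resampled[int(0.025 * n_resamples)]
    hi = resampled[int(0.975 * n_resamples) - 1]
    return mean(diffs), (lo, hi)


# Placeholder scores for six images (made up, NOT data from the article).
simple_prompt = [5, 3, 4, 4, 2, 5]
elaborate_prompt = [4, 3, 4, 3, 2, 5]

gap, interval = paired_bootstrap_diff(simple_prompt, elaborate_prompt)
print(f"mean gap (simple - elaborate): {gap:.3f}, interval: {interval}")
```

If the interval comfortably includes zero, the data do not show that either prompt is better, which is the kind of check that guards against crediting a prompt for performance the model already had.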

The author tested the widely shared "GeoGuessr" prompt for OpenAI's o3 model, which was claimed to solve the game by reasoning like a human, and found that it did not work as advertised. The prompt often produced incorrect guesses and failed to replicate the viral performance seen in demonstrations.
