OpenAI, the company whose stated mission is to build safe and beneficial AGI, has released a report: "AI and covert influence operations: latest trends".
It appears to be the first in a series of reports showing how they combat abuse of their platform. A few notes:
Attacker trends
- Content generation: All of the actors described in this report used our models to generate content (primarily text, occasionally images such as cartoons). Some appear to have done so to improve the quality of their output, generating texts with fewer language errors than would have been possible for human operators. Others appeared more focused on quantity, generating large volumes of short comments that were then posted on third-party platforms.
- Mixing old and new: All of these operations used AI to some degree, but none used it exclusively. Instead, AI-generated material was just one of many types of content they posted, alongside more traditional formats, such as manually written texts, or memes copied from across the internet.
- Faking engagement: Some of the campaigns we disrupted used our models to create the appearance of engagement across social media - for example, by generating replies to their own posts to create false online engagement, which is against our Usage Policies. This is distinct from attracting authentic engagement, which none of the networks described here managed to do.
- Productivity gains: Many of the threat actors that we identified and disrupted used our models in an attempt to enhance productivity. This included uses that would be banal if they had not been put to the service of deceptive networks, such as asking for translations and converting double quotes to single quotes in lists.
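To underline how banal that last "productivity" use is, here is a minimal Python sketch of the quote-conversion chore the report mentions. It is an illustration of the task itself, not the actors' tooling; a one-line string replace does the job without any model at all.

```python
# Minimal illustration (not the actors' actual tooling): the kind of banal
# text chore the report describes, i.e. converting double quotes to single
# quotes in a list of strings.
comments = ['He said "hello" to the crowd', 'A "great" result, apparently']

cleaned = [c.replace('"', "'") for c in comments]
print(cleaned)  # ["He said 'hello' to the crowd", "A 'great' result, apparently"]
```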
Defender trends
- Defensive design: Our models are designed to impose friction on threat actors. We have built them with defense in mind: for example, our latest image generation model, DALL-E 3, has mitigations to decline requests that ask for a public figure by name, and we’ve worked with red teamers—domain experts who stress-test our models and services—to help inform our risk assessment and mitigation efforts in areas like deceptive messaging. We have seen cases where operators like Doppelganger tried to generate images of European politicians, only to be refused by the model.
- AI for defenders: Throughout our investigations, we have built and used our own AI-powered models to make our detection and analysis faster and more effective. AI allows analysts to assess larger volumes of data at greater speeds, refine code and queries, and work across many more languages effectively. By leveraging our models’ capabilities to synthesize and analyze the ways threat actors use those models at scale and in many languages, we have drastically improved the analytical capabilities of our investigative teams, reducing some workflows from hours or days to a few minutes. As our models improve, we’ll continue leveraging their capabilities to improve our investigations too.
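The report does not describe OpenAI's internal tooling, but the "AI for defenders" point is easy to picture. Below is a hedged sketch, in Python with the public OpenAI SDK, of how an analyst might batch-triage multilingual comments with a model so that review takes minutes instead of hours; the model name, prompt, and labels are my assumptions for illustration only.

```python
# Hedged sketch of the "AI for defenders" idea: batch-triaging multilingual
# comments with a model. The model choice, prompt, and label set are
# illustrative assumptions, not OpenAI's internal workflow.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

comments = [
    "Great point, totally agree!",
    "Este comentario repite el mismo eslogan político otra vez.",  # any language
]

def triage(comment: str) -> str:
    """Ask the model for a coarse label an analyst can sort and filter on."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice
        messages=[
            {"role": "system",
             "content": "Label the comment as ORGANIC, REPETITIVE_SLOGAN, or "
                        "SELF_REPLY_PATTERN. Reply with the label only."},
            {"role": "user", "content": comment},
        ],
    )
    return response.choices[0].message.content.strip()

for c in comments:
    print(triage(c), "|", c)
```

The point of a setup like this is not that the model makes the final call, but that it compresses large, multilingual volumes into something a human investigator can scan quickly.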
Case studies:
- Bad Grammar: Unreported Russian threat actor posting political comments in English and Russian on Telegram
- Doppelganger: Persistent Russian threat actor posting anti-Ukraine content across the internet
- Spamouflage: Persistent Chinese threat actor posting content across the internet to praise China and criticize its critics
- International Union of Virtual Media (IUVM): Persistent Iranian threat actor generating pro-Iran, anti-Israel and anti-US website content
- Zero Zeno: For-hire Israeli threat actor posting anti-Hamas, anti-Qatar, pro-Israel, anti-BJP, and pro-Histadrut content across the internet.
IO: (Covert) Influence Operations