Journal
2025-07-25T13:43:09+00:00

First, do no harm.

1,500+ Posts…

Free knowledge sharing for Safe AI. Not for profit. Link-outs to sources provided. Ads are likely to appear on link-outs (zero benefit to this journal's publisher).

Frontier Model Forum

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER. What the Frontier Model Forum will do: Governments and industry agree that, while advanced AI offers tremendous promise to benefit the world, appropriate guardrails are required to mitigate risks. [...]

REPORT. DecodingTrust. Comprehensive Assessment of Trustworthiness in GPT Models.

What is DecodingTrust? DecodingTrust aims at providing a thorough assessment of trustworthiness in GPT models. This research endeavor is designed to help researchers and [...]

OpenAI Finally Allows ChatGPT Complete Internet Access. Those paying for OpenAI’s chatbot are now able to use Bing to give ChatGPT the latest information. Plus, DALL-E 3 integration is rolling out in beta. October 18, 2023.

GIZMODO. OpenAI Finally Allows ChatGPT Complete Internet Access. (OOPS?) By Kyle Barr. Published October 18, 2023. OpenAI’s world-famous chatbot is free to rummage through [...]

OpenAI Alignment Team Lead: Jan Leike’s P(doom) is, like, 10-90%.

OpenAI Alignment Team Lead @JanLeike's P(doom) is 10-90%. pic.twitter.com/CFuyQxQ0hf - Liron Shapira (@liron) August [...]

AI Safety Weekly. AI Tracker #6: Russian Roulette, Anyone?

A lot of high-quality AI safety conversation happens on Twitter. Unfortunately, this leads to great write-ups & content being buried in days. We've set up AI Tracker to [...]

Existential Risk Observatory. Does interpretability change everything? 07 OCT 2023.

Significant news: two breakthroughs in interpretability, a subfield of AI Alignment, came out this week. What is AI Alignment again? AI Alignment [...]

THE WASHINGTON POST. ChatGPT provided better customer service than his staff. He fired them. Artificial intelligence is rapidly changing the world of customer service and call centers. Developing economies worry they’ll face the brunt. October 3, 2023.

REPORT. Low-Resource Languages Jailbreak GPT-4. 03 OCT 2023.

Abstract: AI safety training and red-teaming of large language models (LLMs) are measures to mitigate the generation of unsafe content. Our work exposes the inherent cross-lingual vulnerability of [...]

Future of Life Institute. REGULATE AI NOW.

"Lights out for all of us." [Translation: Death for all humanity.] TRANSCRIPT. In March 2023 an open letter sounded the alarm on the training of giant [...]